ML DevOps Engineer
1 week ago
Galent

Role Overview
In this role, you will deploy, modernize, and scale Machine Learning infrastructure on Kubernetes. You will ensure deployments adhere to enterprise security standards while supporting ML workloads across data ingestion, processing, validation, model training, distributed computation, monitoring, and model serving. You will work with a modern DevOps stack including Kubernetes, Docker, Databricks, Blobfuse, Terraform, Helm, GitHub Actions, and Saltstack, with the majority of infrastructure running on Azure Cloud.

Key Responsibilities
Deploy, upgrade, and maintain ML Kubernetes infrastructure in compliance with security standards.
Build scalable systems for data pipelines, model training, and inference workloads.
Implement and manage CI/CD pipelines for ML systems and platform components.
Automate infrastructure provisioning using Terraform, Helm, and other IaC tools.
Collaborate closely with software engineering, data engineering, and ML teams.
Oversee cloud-based Linux systems (RedHat/Ubuntu) and container orchestration solutions.
Develop scripts and tooling for environment automation, monitoring, and diagnostics.
Troubleshoot system performance and infrastructure reliability issues.
Maintain high-quality engineering documentation.

Must-Have Qualifications
5+ years building automated, production-grade infrastructure.
Strong experience with Kubernetes, Docker, and container orchestration.
Hands-on experience with Terraform.
Solid background in software engineering within development teams.
Expertise with cloud platforms (preferably Azure, AWS acceptable).
Linux administration experience in cloud environments (RedHat/Ubuntu).
Proficiency with Git and version control workflows.
Excellent communication and documentation skills.
Bachelor’s degree in Computer Science or equivalent experience.

Nice-to-Have Skills
Understanding of IP networking, VPNs, DNS, load balancing, and firewalls.
Familiarity with monitoring and observability tools.
Experience with automated testing frameworks.
Strong system performance tuning and diagnostics capabilities.
Expertise with Saltstack or similar configuration management tools.
Knowledge of system-level architecture optimization.

Seniority level: Mid-Senior level
Employment type: Contract
Job function: Information Technology
Industries: IT Services and IT Consulting and Banking
-
Deployment DevOps Engineer
2 days ago
Toronto, Ontario, Canada Adaptive ML Full time
About the team
Adaptive ML is helping companies build singular generative AI experiences by democratizing the use of reinforcement learning. We are building the foundational technologies, tools, and products required for models to learn directly from users' interactions, and for models to self-critique and self-improve from simple written guidelines. Our...
-
Deployment DevOps Engineer
3 days ago
Toronto, Canada Adaptive ML Full time
About the team
Adaptive ML is helping companies build singular generative AI experiences by democratizing the use of reinforcement learning. We are building the foundational technologies, tools, and products required for models to learn directly from users' interactions, and for models to self-critique and self-improve from simple written guidelines. Our...
-
Deployment DevOps Engineer
1 day ago
Toronto, Canada Adaptive ML Full time
About the team
Adaptive ML is helping companies build singular generative AI experiences by democratizing the use of reinforcement learning. We are building the foundational technologies, tools, and products required for models to learn directly from users' interactions, and for models to self-critique and self-improve from simple written guidelines. Our...
-
Deployment DevOps Engineer
11 hours ago
Toronto, Ontario, Canada Adaptive ML Full time
About the team
Adaptive ML is helping companies build singular generative AI experiences by democratizing the use of reinforcement learning. We are building the foundational technologies, tools, and products required for models to learn directly from users' interactions, and for models to self-critique and self-improve from simple written guidelines. Our...
-
Deployment DevOps Engineer
1 day ago
Toronto, Canada Adaptive ML Full time
About the team
Adaptive ML is helping companies build singular generative AI experiences by democratizing the use of reinforcement learning. We are building the foundational technologies, tools, and products required for models to learn directly from users' interactions, and for models to self-critique and self-improve from simple written guidelines. Our...
-
ML DevOps Engineer
3 days ago
Toronto, Canada Galent Full time
Role Overview
In this role, you will deploy, modernize, and scale Machine Learning infrastructure on Kubernetes. You will ensure deployments adhere to enterprise security standards while supporting ML workloads across data ingestion, processing, validation,...
-
Senior DevOps Engineer, ML Infrastructure
3 days ago
Toronto, Canada Serve Robotics Full time
At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliveries available to more people, and benefit local businesses. The Serve fleet has been delighting merchants, customers, and pedestrians along the way in Los...
-
ML DevOps Engineer
5 hours ago
Toronto, Ontario, Canada Galent Full time
Role Overview
In this role, you will deploy, modernize, and scale Machine Learning infrastructure on Kubernetes. You will ensure deployments adhere to enterprise security standards while supporting ML workloads across data ingestion, processing, validation, model training, distributed computation, monitoring, and model serving. You will work with a modern...