Engineering Manager, Inference Platform

1 day ago

Canada Cerebras Full time

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer‑scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry‑leading training and inference speeds and empowers machine learning users to effortlessly run large‑scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

Cerebras' current customers include global corporations across multiple industries, national labs, and top‑tier healthcare systems. In January, we announced a multi‑year, multi‑million‑dollar partnership with Mayo Clinic, underscoring our commitment to transforming AI applications across various fields. In August, we launched Cerebras Inference, the fastest Generative AI inference solution in the world, over 10 times faster than GPU‑based hyperscale cloud inference services.

Location: Toronto / Sunnyvale

We're looking for a deeply technical, hands‑on engineering leader for our Inference Service Platform. You will lead a high‑performing team to tackle a critical challenge: scaling LLM inference on Cerebras’ advanced compute clusters and delivering a world‑class, on‑prem solution for enterprise customers. In this role, you’ll set the technical vision while staying close to the code, architecting highly reliable, low‑latency distributed systems. If you have proven expertise in distributed systems and scaling modern model‑serving frameworks, we want to hear from you.

Responsibilities

Provide hands‑on technical leadership, owning the technical vision and roadmap for the Cerebras Inference Platform, from internal scaling to on‑prem customer solutions.
Lead the end‑to‑end development of distributed inference systems, including request routing, autoscaling, and resource orchestration on Cerebras' unique hardware.
Drive a culture of operational excellence, guaranteeing platform reliability (>99.9% uptime), performance, and efficiency.
Lead, mentor, and grow a high‑caliber team of engineers, fostering a culture of technical excellence and rapid execution.
Productize the platform into an enterprise‑ready, on‑prem solution, collaborating closely with product, ops, and customer teams to ensure successful deployments.

Skills & Qualifications

Technical Leadership: 6+ years in high‑scale software engineering, with 3+ years leading distributed systems or ML infra teams; strong coding and review skills.
Inference Expertise: Proven track record scaling LLM inference: optimizing latency (

Head of Inference Platform Engineering

4 weeks ago

Canada Cerebras Full time

A pioneering AI hardware company is seeking a technical engineering leader for their Inference Service Platform. The role involves leading a team to tackle scaling challenges for LLM inference while ensuring high reliability and performance. Ideal candidates should have strong experience in distributed systems, inference optimization, and technical...
Senior Software Engineer, AI Inference Platform

5 days ago

Canada Cerebras Full time

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer‑scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry‑leading training and inference speeds and empowers machine learning users...
Senior Platform Engineer — AI Inference Services

4 weeks ago

Canada Cerebras Full time

A leading AI technology company in Canada is seeking a Platform Software Engineer to develop key backend services for their Inference platform. The ideal candidate should have over 5 years of backend development experience with strong Python skills. Responsibilities include API design and maintenance, collaborating with cross-functional teams, and...
Senior Research Engineer

1 day ago

Canada Cerebras Full time

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer‑scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry‑leading training and inference speeds and empowers machine learning users...
Principal Engineer, AI Inference Reliability

1 day ago

Canada Cerebras Full time

About the team: The Cerebras Inference team’s mission is to deliver the world’s most performant, secure, and reliable enterprise‑grade AI service. We build and operate large‑scale distributed systems that power AI inference at unprecedented speed and efficiency. Join us to help scale inference and accelerate AI. About the role: We’re looking for a...
AI Inference Engineer — Open-Source Integrations

1 week ago

Canada Cerebras Full time

A leading AI technology company in Canada seeks an experienced software engineer to develop open-source libraries and applications for its innovative inference platform. The role involves collaborating with engineering teams and creating demo applications that showcase the platform's advantages. Candidates should have a degree in computer science, 4+ years...
GPU Cloud Platform Engineer

1 week ago

Canada Yotta Labs Full time

Join to apply for the GPU Cloud Platform Engineer role at Yotta Labs. About Yotta Labs: Yotta Labs is pioneering the development of a Decentralized Operating System (DeOS) for AI workload orchestration at planetary scale. Our mission is to democratize access to AI resources by aggregating geo-distributed GPUs, enabling high-performance computing for AI...
Edge AI Research Engineer

4 weeks ago

Canada EnCharge AI Full time

A cutting-edge AI technology company in Canada is seeking an experienced AI Research Engineer to optimize deep learning models for edge AI platforms. The successful candidate will work on model compression, quantization strategies, and inference techniques. A Master's or Ph.D. in a relevant field and strong expertise in deep learning and model optimization...
Senior Inference ML Engineer — Sparse Attention

1 day ago

Canada Cerebras Full time

A leading AI technology company is seeking a Senior Research Engineer to enhance inference models for its innovative hardware. The ideal candidate will possess advanced skills in Python or C++, along with significant experience in machine learning and AI technologies. This role involves designing and optimizing transformer architectures, leading research on...
Senior Engineering Manager, Core AI Platform

4 weeks ago

Canada The Resume Database Full time

November 10, 2025. Role Description: At Dropbox, we believe in simplifying the way people work together. We provide a range of innovative cloud-based solutions to empower individuals and businesses to share, access, and collaborate on their files seamlessly. Engineering Managers are pivotal in shaping our mission of...
