Distributed Systems Engineer for AI Cluster
3 days ago
A leading AI technology company in Toronto is seeking a Software Engineer to innovate in developing their massive AI supercomputers. Responsibilities include automating configuration of networking and OS, creating monitoring systems, and developing orchestration tools for resource management. Ideal candidates will have a strong background in software architecture and experience with distributed systems, Kubernetes, and monitoring tools. Enjoy a vibrant work culture focused on groundbreaking AI research and development.
-
Distributed Systems Engineer for AI Cluster
5 days ago
Toronto, Canada Cerebras Systems Inc. Full time
A leading AI technology company in Toronto is seeking a Software Engineer to innovate in developing their massive AI supercomputers. Responsibilities include automating configuration of networking and OS, creating monitoring systems, and developing orchestration tools for resource management. Ideal candidates will have a strong background in software...
-
AI Cluster Networking Architect
4 weeks ago
Toronto, Canada Cerebras Systems Full time
A leading AI technology company in Toronto is seeking a Network Engineer for its Cluster Architecture Team. The role focuses on designing efficient AI/ML and HPC clusters and involves multiple projects and collaboration across teams. Candidates should have a Ph.D. or Master's with relevant industry experience, and possess skills in network design and...
-
Network Engineer
4 weeks ago
Toronto, Canada Cerebras Systems Full time
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to...
-
Distributed Software Engineer
5 days ago
Toronto, Canada Cerebras Systems Inc. Full time
About The Role: Cerebras Systems is a pioneer in large-scale AI Supercomputers. These multi-exaflop supercomputers are deployed in some of the biggest datacenters. These supercomputers are built using our Wafer-Scale Cluster technology – a cluster of several Wafer Scale Engine (WSE) chips. The Cluster engineering team is responsible for delivering software...
-
Distributed Software Engineer
3 days ago
Toronto, Canada Cerebras Systems Inc. Full time
About The Role: Cerebras Systems is a pioneer in large-scale AI Supercomputers. These multi-exaflop supercomputers are deployed in some of the biggest datacenters. These supercomputers are built using our Wafer-Scale Cluster technology – a cluster of several Wafer Scale Engine (WSE) chips. The Cluster engineering team is responsible for delivering software...
-
Senior SRE: AI/ML HPC Infra
3 weeks ago
Toronto, Canada Boson AI Full time
A technology-driven AI company is seeking a Site Reliability Engineer to manage and optimize their advanced GPU cluster in Toronto. You'll be engaged in planning, deployment, and operation of HPC infrastructure while working closely with engineering teams. Ideal candidates will have a strong foundation in Linux systems, Kubernetes, and significant experience...
-
Principal Software Engineer
5 days ago
Toronto, Canada Latinx in AI (LXAI) Full time
Principal Software Engineer - AI Platform. Workday, Inc. Remote friendly (Canada, ON, Toronto Canada). Worldwide. Data Science.
About The Team: The Workday AI Infrastructure and Operations team is seeking an energetic and determined Software Engineer to design, implement, and deliver highly scalable features for our AI Platform. As a member of this fast-paced...