Distributed Systems Engineer for AI Cluster

3 days ago


Toronto, Canada Cerebras Systems Inc. Full time

A leading AI technology company in Toronto is seeking a Software Engineer to help develop its massive AI supercomputers. Responsibilities include automating network and OS configuration, creating monitoring systems, and developing orchestration tools for resource management. Ideal candidates will have a strong background in software architecture and experience with distributed systems, Kubernetes, and monitoring tools. Enjoy a vibrant work culture focused on groundbreaking AI research and development.



  • Toronto, Canada Cerebras Systems Full time

    A leading AI technology company in Toronto is seeking a Network Engineer for its Cluster Architecture Team. The role focuses on designing efficient AI/ML and HPC clusters and involves multiple projects and collaboration across teams. Candidates should have a Ph.D. or Master's with relevant industry experience, and possess skills in network design and...


  • Network Engineer

    4 weeks ago


    Toronto, Canada Cerebras Systems Full time

    Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to...


  • Toronto, Canada Cerebras Systems Inc. Full time

    About The Role: Cerebras Systems is a pioneer in large-scale AI Supercomputers. These multi-exaflop supercomputers are deployed in some of the biggest datacenters. These supercomputers are built using our Wafer-Scale Cluster technology – a cluster of several Wafer Scale Engine (WSE) chips. The Cluster engineering team is responsible for delivering software...


  • Toronto, Canada Boson AI Full time

    A technology-driven AI company is seeking a Site Reliability Engineer to manage and optimize their advanced GPU cluster in Toronto. You'll be engaged in planning, deployment, and operation of HPC infrastructure while working closely with engineering teams. Ideal candidates will have a strong foundation in Linux systems, Kubernetes, and significant experience...


  • Toronto, Canada Latinx in AI (LXAI) Full time

    Principal Software Engineer - AI Platform. Workday, Inc. Remote friendly (Canada, ON, Toronto Canada). Worldwide. Data Science. About The Team: The Workday AI Infrastructure and Operations team is seeking an energetic and determined Software Engineer to design, implement, and deliver highly scalable features for our AI Platform. As a member of this fast-paced...