Senior ML Infrastructure Engineer

7 days ago


Town of Oxford, Canada Ellison Institute of Technology Full time

At the Ellison Institute of Technology (EIT), we’re on a mission to translate scientific discovery into real-world impact. We bring together visionary scientists, technologists, policy makers, and entrepreneurs to tackle humanity’s greatest challenges in four transformative areas:

  • Health, Medical Science & Generative Biology
  • Food Security & Sustainable Agriculture
  • Climate Change & Managing CO₂
  • Artificial Intelligence & Robotics

This is ambitious work - work that demands curiosity, courage, and a relentless drive to make a difference. At EIT, you’ll join a community built on excellence, innovation, tenacity, trust, and collaboration, where bold ideas become real-world breakthroughs. Together, we push boundaries, embrace complexity, and create solutions to scale ideas from lab to society.

Our MLOps team

Join our MLOps team to build the cloud and compute foundation that enables scientific breakthroughs. Deliver reliable, secure platforms and self-service guardrails that accelerate experimentation and turn ideas into results faster, at scale, and with confidence.

Day-to-day, you might:

  • Build, operate, and continuously optimise our high-performance GPU training and inference clusters, focusing on robust, high-availability scheduling, isolation, and automated lifecycle management.
  • Drive systems design and implementation for high-throughput data paths, optimising I/O, caching, and data locality across compute and storage (including our current Lustre implementation).
  • Proactively benchmark, profile, and resolve performance bottlenecks across the compute, network, and orchestration layers to maximise efficiency for distributed training and inference.
  • Establish comprehensive observability, resilience, and automated security controls to ensure compliance and robust operation of sensitive research environments.
  • Partner with Research, Data, and Applied teams to forecast capacity and cost for GPU and storage needs, setting quotas and streamlining ML experimentation pipelines.

What makes you a great fit:

  • Proven experience leading the design, build, and operation of high-performance ML compute clusters at scale
  • A proactive, autonomous approach to systems design, and the proven ability and desire to ideate, co-create, and implement optimal solutions
  • Exposure to migrating or transforming ML infrastructure from traditional schedulers to modern, containerised systems
  • Expertise with high-throughput storage systems for ML/HPC workloads
  • Expert-level understanding of GPU architecture, high-speed networking for distributed training, and performance profiling to resolve bottlenecks
  • A solid grasp of IaC and CI/CD practices (e.g., Terraform, Argo CD)

We offer the following salary and benefits:

  • Enhanced holiday pay
  • Pension
  • Life Assurance
  • Income Protection
  • Private Medical Insurance
  • Hospital Cash Plan
  • Therapy Services
  • Perk Box
  • Electric Car Scheme

Why work for EIT:

At the Ellison Institute, we believe a collaborative, inclusive team is key to our success. We are building a supportive environment where creative risks are encouraged and everyone feels heard. Valuing emotional intelligence, empathy, respect, and resilience, we encourage people to be curious and to have a shared commitment to excellence. Join us and make an impact.


