ML Compiler Architect, Senior Principal

5 days ago


Toronto, Canada d-Matrix Full time

At d-Matrix, we are focused on unleashing the potential of generative AI to power the transformation of technology. We are at the forefront of software and hardware innovation, pushing the boundaries of what is possible. Our culture is one of respect and collaboration. We value humility and believe in direct communication. Our team is inclusive, and our differing perspectives allow for better solutions. We are seeking individuals who are passionate about tackling challenges and driven by execution. Ready to come find your playground? Together, we can help shape the endless possibilities of AI.

Location: Hybrid, working onsite at our Toronto, Ontario, Canada headquarters 3-5 days per week.

Role: Software Compiler Architect - MLIR/LLVM for Cloud Inference

What You Will Do

As a hands-on Front-End Software Compiler Architect focused on cloud-based AI inference, you will drive the design and implementation of a scalable MLIR-based compiler framework optimized for deploying large-scale NLP and transformer models in cloud environments. You will architect the end-to-end software pipeline that translates high-level AI models into efficient, low-latency executables on a distributed, multi-chiplet hardware platform featuring heterogeneous compute elements such as in-memory tensor processors, vector engines, and hierarchical memory.

Your compiler designs will enable dynamic partitioning, scheduling, and deployment of inference workloads across a cloud-scale infrastructure, supporting both statically compiled and runtime-optimized execution paths. Beyond developing the compiler itself, you will focus on compiler strategies that minimize inference latency, maximize throughput, and efficiently utilize compute and memory resources in data center environments.

You will collaborate cross-functionally with systems architects, ML framework teams, runtime developers, performance engineers, and cloud orchestration groups to ensure seamless integration and optimized inference delivery at scale.

Key Responsibilities

  • Architect the MLIR-based compiler for cloud inference workloads, focusing on efficient mapping of large-scale AI models (e.g., LLMs, Transformers, Torch-MLIR) onto distributed compute and memory hierarchies.
  • Lead the development of compiler passes for model partitioning, operator fusion, tensor layout optimization, memory tiling, and latency-aware scheduling.
  • Design support for hybrid offline/online compilation and deployment flows with runtime-aware mapping, allowing for adaptive resource utilization and load balancing in cloud scenarios.
  • Define compiler abstractions that interoperate efficiently with runtime systems, orchestration layers, and cloud deployment frameworks.
  • Drive scalability, reproducibility, and performance through well-designed IR transformations and distributed execution strategies.
  • Mentor and guide a team of compiler engineers to deliver high-performance inference-optimized software stacks.

What You Will Bring

  • BS with 15+ years, MS with 12+ years, or PhD with 10+ years in Computer Science or Electrical Engineering, with 12+ years of experience in front-end compiler and systems software development, with a focus on ML inference.
  • Deep experience in designing or leading compiler efforts using MLIR, LLVM, Torch-MLIR, or similar frameworks.
  • Strong understanding of model optimization for inference: quantization, fusion, tensor layout transformation, memory hierarchy utilization, and scheduling.
  • Expertise in deploying ML models to heterogeneous compute environments, with specific attention to latency, throughput, and resource scaling in cloud systems.
  • Proven track record working with AI frameworks (e.g., PyTorch, TensorFlow), ONNX, and hardware backends.
  • Experience with cloud infrastructure, including resource provisioning, distributed execution, and profiling tools.

Preferred Qualifications

  • Experience targeting inference accelerators (AI ASICs, FPGAs, GPUs) in cloud-scale deployments.
  • Knowledge of cloud deployment orchestration (e.g., Kubernetes, containerized AI workloads).
  • Strong leadership skills with experience mentoring teams and collaborating with large-scale software and hardware organizations.
  • Excellent written and verbal communication; capable of presenting complex compiler architectures and trade-offs to both technical and executive stakeholders.

This role is a cornerstone of our cloud AI software strategy. You'll shape the way inference workloads are deployed, optimized, and scaled across data center infrastructure.

Equal Opportunity Employment Policy

d-Matrix is proud to be an equal opportunity workplace and affirmative action employer. We're committed to fostering an inclusive environment where everyone feels welcomed and empowered to do their best work. We hire the best talent for our teams, regardless of race, religion, color, age, disability, sex, gender identity, sexual orientation, ancestry, genetic information, marital status, national origin, political affiliation, or veteran status. Our focus is on hiring teammates with humble expertise, kindness, dedication and a willingness to embrace challenges and learn together every day.

d-Matrix does not accept resumes or candidate submissions from external agencies. We appreciate the interest and effort of recruitment firms, but we kindly request that individuals interested in opportunities with d-Matrix apply directly through our official channels. This approach allows us to streamline our hiring processes and maintain a consistent and fair evaluation of all applicants. Thank you for your understanding and cooperation.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Design, Art/Creative, and Information Technology
Industries: Semiconductor Manufacturing



  • Toronto, Canada d-Matrix Full time

    At d-Matrix, we are focused on unleashing the potential of generative AI to power the transformation of technology. We are at the forefront of software and...


  • Toronto, Ontario, Canada d-Matrix Full time $200,000 - $300,000 per year

    At d-Matrix, we are focused on unleashing the potential of generative AI to power the transformation of technology. We are at the forefront of software and hardware innovation, pushing the boundaries of what is possible. Our culture is one of respect and collaboration. We value humility and believe in direct communication. Our team is inclusive, and our...

  • ML Compiler Engineer

    2 weeks ago


    Toronto, Ontario, Canada Amazon Web Services (AWS) Full time US$140,000 - US$200,000 per year

    The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The Product: The AWS Machine Learning accelerators (Inferentia/Trainium) offer unparalleled ML inference and training...


  • Toronto, Ontario, CAN, Canada Amazon Full time $120,000 - $180,000 per year

    The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The Product: The AWS Machine Learning accelerators (Inferentia/Trainium) offer unparalleled ML inference and training performances....


  • Toronto, Canada Amazon Web Services (AWS) Full time

    Role: ML Compiler Engineer, AWS Neuron, Annapurna Labs at Amazon Web Services (AWS). At AWS our vision is to make deep learning pervasive for everyday developers and to democratize access to innovative infrastructure. In order to deliver on that vision, we've created innovative software and hardware solutions that make it...


  • Toronto, Canada Amazon Full time

    A leading technology company in Toronto is seeking a Senior Deep Learning Compiler Engineer to develop compilers targeting AWS Inferentia and Trainium. You will work at the intersection of machine learning and distributed architectures, mentoring a team while driving innovative solutions for large ML workloads. The ideal candidate has extensive software...

  • Principal ML Engineer

    4 weeks ago


    Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Launch Potato Full time

    About Launch Potato: Launch Potato is a profitable digital media company that reaches over 30M monthly visitors through brands such as FinanceBuzz, All About Cookies, and OnlyInYourState. As the discovery and conversion company, our mission is to connect...