Software Engineer, Systems ML

2 weeks ago


Old Toronto, Canada Meta Full time
In this role, you will be a member of the MTIA (Meta Training & Inference Accelerator) Software team and part of the larger, industry-leading PyTorch AI framework organization. The MTIA Software team has been developing a comprehensive AI compiler strategy that delivers a highly flexible platform to train and serve new DL/ML model architectures, combined with auto-tuned high performance for production environments across specialized hardware architectures. The compiler stack, DL graph optimizations, and kernel authoring for specific hardware directly impact the performance and deployment velocity of both AI training and inference platforms at Meta.

You will work on one of the core areas, such as PyTorch framework components, the AI compiler and runtime, high-performance kernels, and tooling to accelerate machine learning workloads on the current and next generation of MTIA AI hardware platforms. You will work closely with AI researchers to analyze deep learning models and lower them efficiently onto MTIA hardware. You will also partner with hardware design teams to develop compiler optimizations for high performance. You will apply software development best practices to design features, optimizations, and performance tuning techniques. You will gain valuable experience in developing machine learning compiler frameworks and will help drive next-generation hardware-software codesign for AI domain-specific problems.

Software Engineer, Systems ML - Frameworks / Compilers / Kernels Responsibilities:
  • Develop the software stack with one of the following core focus areas: AI frameworks, compiler stack, high-performance kernel development, and acceleration onto the next generation of hardware architectures.
  • Contribute to the development of the industry-leading PyTorch AI framework core compilers to support new state-of-the-art inference and training AI hardware accelerators and optimize their performance.
  • Analyze deep learning networks, and develop and implement compiler optimization algorithms.
  • Collaborate with AI research scientists to accelerate the next generation of deep learning models, such as recommendation systems, generative AI, computer vision, and NLP.
  • Tune and optimize the performance of deep learning framework and software components.


Minimum Qualifications:
  • Proven C/C++ programming skills.
  • Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta.
  • Experience in AI framework development or accelerating deep learning models on hardware architectures.


Preferred Qualifications:
  • A Bachelor's degree in Computer Science, Computer Engineering, or a relevant technical field and 4+ years of experience in AI framework development or accelerating deep learning models on hardware architectures, OR a Master's degree in Computer Science, Computer Engineering, or a relevant technical field and 2+ years of experience in AI framework development or accelerating deep learning models on hardware architectures, OR a PhD in Computer Science, Computer Engineering, or a relevant technical field.
  • Knowledge of GPU, CPU, or AI hardware accelerator architectures.
  • Experience working with frameworks like PyTorch, Caffe2, TensorFlow, ONNX, TensorRT.
  • OR AI high-performance kernels: Experience with CUDA programming, OpenMP / OpenCL programming, or AI hardware accelerator kernel programming. Experience in accelerating libraries on AI hardware, similar to cuBLAS, cuDNN, CUTLASS, HIP, ROCm, etc.
  • OR AI compiler: Experience with compiler optimizations such as loop optimizations, vectorization, parallelization, and hardware-specific optimizations such as SIMD. Experience with MLIR, LLVM, IREE, XLA, TVM, Halide is a plus.
  • OR AI frameworks: Experience in developing training and inference framework components. Experience in system performance optimizations such as runtime analysis of latency, memory bandwidth, I/O access, compute utilization analysis, and associated tooling development.


About Meta:
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today—beyond the constraints of screens, the limits of distance, and even the rules of physics.

CA$104,000/year to CA$148,000/year + bonus + equity + benefits

Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.

  • Old Toronto, Ontario, Canada Cerebras Systems Full time

    About the Role: Cerebras Systems is revolutionizing the field of artificial intelligence with its cutting-edge technology. As an ML Integration and Ops Engineer, you will play a crucial role in bringing together software and hardware components to make large-scale LLM model training simple and easy to use. Key Responsibilities: Drive technical projects involving...


  • Old Toronto, Canada Cerebras Systems Full time

    Cerebras has developed a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation. We are innovating at every level of the stack – from chip, to...

  • Software Engineer

    3 weeks ago


    Toronto, Ontario, Canada TD Bank Full time

    Software Engineer - ML Engine. Job Summary: We are seeking a highly skilled Software Engineer to join our ML Engine team at TD Bank. As a key member of our team, you will be responsible for designing, developing, and deploying robust and scalable machine learning models. Main Responsibilities: Design and develop machine learning models using various algorithms...


  • Old Toronto, Ontario, Canada Workday, Inc. Full time

    Unlock Your Potential as a Principal Software Engineer - ML. Transform the Future of Machine Learning at Workday. At Workday, we're revolutionizing the enterprise software market with a culture that puts our people first. As a Principal Software Engineer - ML, you'll be part of our pioneering Machine Learning Platform team, developing a cutting-edge platform...


  • Old Toronto, Ontario, Canada Workday, Inc. Full time

    Transform the Future of Machine Learning at Workday. At Workday, we're revolutionizing the enterprise software market with a culture that puts our people first. As a Principal Software Engineer on our Machine Learning (ML) team, you'll play a crucial role in developing a pioneering platform that enables our ML teams to handle and deploy their models. Our...


  • Old Toronto, Ontario, Canada Autodesk Full time

    Job Title: Principal Software Engineer, AI/ML Platform. Job Summary: We are seeking a highly skilled Principal Software Engineer to lead the development of our next-generation AI/ML platform. As a key member of our team, you will design and engineer software systems for the AI/ML Platform, contributing to the full ML development lifecycle. Your expertise in...


  • Old Toronto, Canada Cerebras Systems Full time

    Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation. We are innovating at every level of the stack – from chip, to microcode, to power delivery and cooling, to new algorithms and network architectures at the cutting edge of ML...


  • Old Toronto, Canada Autodesk Full time

    Job Requisition ID # 24WD82577. Position Overview: We are seeking a dynamic and enthusiastic principal software engineer to develop our next-generation AI/ML platform used in the development of Autodesk’s suite of products and services. Join our dynamic and rapidly expanding team to help build innovative capabilities that enable faster and more secure...

  • Software Engineer

    1 month ago


    Toronto, Ontario, Canada TD Bank Full time

    Unlock Your Potential as a Software Engineer - ML Expert. Location: Canada. Schedule: 37.5 hours per week. Industry: Data and Analytics. Compensation Details: We offer a competitive compensation package that reflects our commitment to fairness and equity. As a candidate, we encourage you to discuss compensation with your recruiter, including specific salary details...


  • Old Toronto, Canada Workday, Inc. Full time

    Principal Software Engineer - ML. Your work days are brighter here. At Workday, it all began with a conversation over breakfast. When our founders met at a sunny California diner, they came up with an idea to revolutionize the enterprise software market. And when we began to rise, one thing that really set us apart was our culture. A culture which was driven by...

  • Lead Product Engineer

    3 weeks ago


    Old Toronto, Canada Fusemachines Full time

    About Fusemachines: Fusemachines is a leading AI strategy, talent, and education services and products provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 400...

  • Lead Product Engineer

    4 months ago


    Old Toronto, Canada Fusemachines Full time

    About Fusemachines: Fusemachines is a leading AI strategy, talent, and education services and products provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 400...


  • Old Toronto, Ontario, Canada Nexus Systems Group Inc. Full time

    Job Title: AI/ML Solution Architect. We are seeking a highly skilled AI/ML Solution Architect to join our team at Nexus Systems Group Inc. as a key member of our technology leadership team. Key Responsibilities: Design and develop architectural solutions for AI and machine learning applications that align with our business and technology strategy. Develop and...


  • Old Toronto, Ontario, Canada Cerebras Systems Full time

    About the Role: Cerebras Systems is revolutionizing the field of machine learning with its cutting-edge technology. As an MTS (ML Integration and Ops Engineer), you will play a crucial role in bringing together software and hardware components to make large-scale LLM model training simple and easy to use. You will be part of the MIQ (ML Integration and Quality)...

  • Software Engineer

    3 months ago


    Old Toronto, Canada Cresta CTO & co Full time

    Software Engineer (ML Platform - Chat Agent). Are you ready to redefine the future of work with cutting-edge AI? At Cresta, we're on a groundbreaking mission to supercharge the effectiveness of knowledge workers, making them 100x more productive, 10x faster, and 10x better. Imagine transforming Call Center operations with our real-time agent assist product and...