Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs

2 weeks ago


Toronto, Canada Amazon Web Services (AWS) Full time

The Annapurna Labs team at AWS builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team focuses on maximizing performance for AWS’s custom ML accelerators. This role involves crafting high-performance kernels for ML functions at the hardware-software boundary to ensure optimal performance for demanding workloads. You will work across frameworks, compilers, runtime, and collectives, contributing to future architecture designs and customer enablement. This is an opportunity to work at the intersection of machine learning, high-performance computing, and distributed architectures, shaping the future of AI acceleration technology.

This is a chance to work on cutting-edge products, architect and implement business-critical features, publish research, and mentor engineers in a small, agile team that values experimentation and learning. The team collaborates closely with customers on model enablement, providing optimization expertise for ML workloads on AWS accelerators.

Explore the product and our history:
  • https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-cc/index.html
  • https://aws.amazon.com/machine-learning/neuron/
  • https://github.com/aws/aws-neuron-sdk
  • https://www.amazon.science/how-silicon-innovation-became-the-secret-sauce-behind-awss-success

Key job responsibilities
  • Design and implement high-performance compute kernels for ML operations, leveraging the Neuron architecture and programming models
  • Analyze and optimize kernel-level performance across multiple generations of Neuron hardware
  • Conduct detailed performance analysis using profiling tools to identify and resolve bottlenecks
  • Implement compiler optimizations such as fusion, sharding, tiling, and scheduling
  • Work directly with customers to enable and optimize their ML models on AWS accelerators
  • Collaborate across teams to develop innovative kernel optimization techniques

A day in the life
As you design and code solutions to drive efficiencies in software architecture, you’ll create metrics, implement automation and other improvements, and resolve root causes of software defects. You’ll build high-impact solutions for a large customer base, participate in design discussions and code reviews, and work cross-functionally to drive business decisions with your technical input. You’ll thrive in a startup-like development environment focused on the most important work.

About The Team
Diversity of experiences is valued; candidates not meeting every qualification are encouraged to apply.
  • Why AWS: AWS is a leading cloud platform trusted by startups to Global 500 companies.
  • Inclusive team culture with employee affinity groups and leadership principles guiding collaboration.
  • Work/Life balance with flexible hours.
  • Mentorship and career growth opportunities.

Basic Qualifications
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming with at least one programming language
  • 5+ years of leading design or architecture of systems
  • Experience as a mentor, tech lead, or leading an engineering team

Preferred Qualifications
  • 5+ years of full software development lifecycle experience
  • Bachelor’s degree in computer science or equivalent
  • Expertise in accelerator architectures for ML or HPC (GPUs, CPUs, FPGAs, or custom)
  • Experience with GPU kernels and backends (CUDA, OpenCL, SYCL, ROCm, etc.)
  • Experience with NVIDIA PTX and/or AMD GPU ISA
  • Experience developing high performance libraries for HPC
  • Proficiency in low-level GPU performance optimization
  • Experience with LLVM/MLIR backend development for GPUs
  • Knowledge of ML frameworks (PyTorch, TensorFlow) and their GPU backends
  • Experience with parallel programming and optimization techniques
  • Understanding of GPU memory hierarchies and optimization strategies

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status. If you require a workplace accommodation during the application or hiring process, please visit amazon.jobs/accommodations for more information.

Company - Amazon Development Centre Canada ULC
Job ID: A3059954
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology, Consulting, and Engineering
Industries: IT Services and IT Consulting



  • Toronto, Canada Amazon Web Services (AWS) Full time

    Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs Join to apply for the Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs role at Amazon Web Services (AWS). The Annapurna Labs team at AWS builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning...


  • Toronto, Canada Amazon Web Services (AWS) Full time

    Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs Join to apply for the Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs role at Amazon Web Services (AWS). The Annapurna Labs team at AWS builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning...


  • Toronto, Canada Amazon Web Services (AWS) Full time

    Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs Join to apply for the Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs role at Amazon Web Services (AWS). The Annapurna Labs team at AWS builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning...


  • Toronto, Canada Amazon Web Services (AWS) Full time

    ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing...


  • Toronto, Canada Amazon Web Services (AWS) Full time

    ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing...


  • Toronto, Canada Amazon Full time

    Description: The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing performance for AWS's custom ML accelerators....


  • Toronto, Canada Amazon Full time

    The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing performance for AWS's custom ML accelerators. Working at...


  • Toronto, Ontario, Canada Amazon Full time $120,000 - $180,000 per year

    The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing performance for AWS's custom ML accelerators. Working at the...


  • Toronto, Ontario, CAN, Canada Amazon Full time $180,000 - $250,000 per year

    The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing performance for AWS's custom ML accelerators. Working at the...


  • Toronto, Ontario, Canada Amazon Full time $120,000 - $180,000 per year

    DESCRIPTION: The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing performance for AWS's custom ML accelerators....