Machine Learning Engineer, Reinforcement Learning

2 weeks ago

Vancouver, British Columbia, Canada | Wayve | Full time

At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.

About Us
Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.

Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.

In our fast-paced environment, big problems ignite us: we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.

At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact.

Make Wayve the experience that defines your career.

The role

We're looking for a Machine Learning Engineer with strong experience in reinforcement learning (RL), reward modelling, and large-scale ML systems to advance how we train, evaluate, and deploy embodied AI behaviours. This role sits at the intersection of ML engineering, applied RL research, and ML systems, working on the frameworks that guide how our autonomous agents learn from data, simulation, and real-world experience.

As an MLE on the Accelerated Learning Loop team, you will:

Design and optimise end-to-end pipelines for training reward models and RL agents, ensuring they are reproducible and high-throughput.
Develop tooling for data processing, annotation, and inference within RL workflows.
Build, refine, and deploy reward models that encode safe, interpretable, and effective driving behaviours.
Integrate reward models with diverse data sources: real-world trajectories, simulation, and synthetic datasets.
Conduct ablations, hyperparameter explorations, and controlled studies to analyse how reward structures, data composition, and training dynamics affect policy performance.
Diagnose failure modes, investigate emergent behaviours, and iterate on reward objectives to improve reliability.
Work closely with RL scientists to translate research ideas into scalable engineering solutions.
Partner with evaluation teams to integrate reward and RL models into offline/online testing suites and simulation frameworks.
Establish best practices around code quality, reproducibility, and deployment readiness.
Build internal tools and visualisations that enable faster debugging, deeper insights, and more efficient iteration across the RL and reward modelling stack.

This role is ideal for someone who enjoys building systems and running fast, grounded experiments, and who is motivated by delivering real impact on the behaviour of embodied AI systems in the real world.

Must-haves

Experience applying reinforcement learning techniques, including offline RL, reward modelling, RLHF-style approaches, or similar
Proficiency in Python and modern ML frameworks (e.g., PyTorch, JAX, Ray, or equivalent)
Experience building ML pipelines or large-scale training workflows in production or research environments
Strong understanding of simulation environments and/or real-world behavioural data
Ability to design and run experiments, analyse results, and turn insights into actionable improvements
Strong problem-solving skills and the ability to work effectively in cross-functional teams

Nice-to-haves

Experience contributing to research (e.g., publications at NeurIPS, ICLR, CoRL, CVPR)
Understanding of self-driving technologies, sensor data, or real-time decision-making algorithms
Experience with distributed training systems and cloud compute environments (Azure, AWS, GCP)
Exposure to large-scale simulation, embodied AI, or robotics systems

What we offer you

Attractive compensation with salary and equity
Immersion in a team of world-class researchers, engineers and entrepreneurs
A unique position to shape the future of autonomy and tackle the biggest challenge of our time
Bespoke learning and development opportunities
Relocation support with visa sponsorship
Flexible working hours - we trust you to do your job well, at times that suit you and your team
Benefits such as an onsite chef, workplace nursery scheme, private health insurance, therapy, daily yoga, onsite bar, large social budgets, unlimited L&D requests, enhanced parental leave, and more

This is a full-time role based in our office in Vancouver. At Wayve, we want the best of all worlds, so we operate a hybrid working policy that combines time together in our offices and workshops, to fuel innovation, culture, relationships and learning, with time spent working from home.

We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you're passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.

For more information visit Careers at Wayve.

To learn more about what drives us, visit Values at Wayve.

DISCLAIMER: We will not ask about marriage or pregnancy, care responsibilities or disabilities in any of our job adverts or interviews. However, we do look to capture information about care responsibilities and disabilities, among other diversity information, as part of an optional DEI Monitoring form. This helps us identify areas of improvement in our hiring process and ensure that the process is inclusive and non-discriminatory.

