Machine Learning Engineer, Reinforcement Learning
2 weeks ago
At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.
About Us
Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.
Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.
In our fast-paced environment, big problems ignite us. We embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.
At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact.
Make Wayve the experience that defines your career
The role
We're looking for a Machine Learning Engineer with strong experience in reinforcement learning (RL), reward modeling, and large-scale ML systems to advance how we train, evaluate, and deploy embodied AI behaviours. This role sits at the intersection of ML engineering, applied RL research, and ML systems, working on the frameworks that guide how our autonomous agents learn from data, simulation, and real-world experience.
As an MLE on the Accelerated Learning Loop team, you will:
- Design and optimise end-to-end pipelines for training reward models and RL agents, ensuring they are reproducible and high-throughput.
- Develop tooling for data processing, annotation, and inference within RL workflows.
- Build, refine, and deploy reward models that encode safe, interpretable, and effective driving behaviours.
- Integrate reward models with diverse data sources: real-world trajectories, simulation, and synthetic datasets.
- Conduct ablations, hyperparameter explorations, and controlled studies to analyse how reward structures, data composition, and training dynamics affect policy performance.
- Diagnose failure modes, investigate emergent behaviours, and iterate on reward objectives to improve reliability.
- Work closely with RL scientists to translate research ideas into scalable engineering solutions.
- Partner with evaluation teams to integrate reward and RL models into offline/online testing suites and simulation frameworks.
- Establish best practices around code quality, reproducibility, and deployment readiness.
- Build internal tools and visualisations that enable faster debugging, deeper insights, and more efficient iteration across the RL and reward modeling stack.
This role is ideal for someone who enjoys building systems and running fast, grounded experiments, and who is motivated by delivering real impact on the behaviour of embodied AI systems in the real world.
Must-haves
- Experience applying reinforcement learning techniques, including offline RL, reward modeling, RLHF-style approaches, or similar
- Proficiency in Python and modern ML frameworks (e.g., PyTorch, JAX, Ray, or equivalent)
- Experience building ML pipelines or large-scale training workflows in production or research environments
- Strong understanding of simulation environments and/or real-world behavioural data
- Ability to design and run experiments, analyse results, and turn insights into actionable improvements
- Strong problem-solving skills and the ability to work effectively in cross-functional teams
Nice-to-haves
- Experience contributing to research (e.g., publications at NeurIPS, ICLR, CoRL, CVPR)
- Understanding of self-driving technologies, sensor data, or real-time decision-making algorithms
- Experience with distributed training systems and cloud compute environments (Azure, AWS, GCP)
- Exposure to large-scale simulation, embodied AI, or robotics systems
What we offer you
- Attractive compensation with salary and equity
- Immersion in a team of world-class researchers, engineers and entrepreneurs
- A unique position to shape the future of autonomy and tackle the biggest challenge of our time
- Bespoke learning and development opportunities
- Relocation support with visa sponsorship
- Flexible working hours - we trust you to do your job well, at times that suit you and your team
- Benefits such as an onsite chef, workplace nursery scheme, private health insurance, therapy, daily yoga, onsite bar, large social budgets, unlimited L&D requests, enhanced parental leave, and more
This is a full-time role based in our office in Vancouver. At Wayve we want the best of all worlds so we operate a hybrid working policy that combines time together in our offices and workshops to fuel innovation, culture, relationships and learning, and time spent working from home.
We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you're passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.
For more information visit Careers at Wayve.
To learn more about what drives us, visit Values at Wayve.
DISCLAIMER: We will not ask about marriage or pregnancy, care responsibilities or disabilities in any of our job adverts or interviews. However, we do look to capture information about care responsibilities and disabilities, among other diversity information, as part of an optional DEI Monitoring form to help us identify areas of improvement in our hiring process and ensure that the process is inclusive and non-discriminatory.