Researcher - Reinforcement Learning

2 weeks ago


Street Northwest, Edmonton, Alberta, TG C, Canada
Huawei Technologies Canada Co.
Full time
$85,000 - $140,000 per year
Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.


About the team:

Founded in 2012, the Noah's Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab's mission focuses on advancing artificial intelligence and related fields to benefit the company and society. Driven by impactful, long-term projects, the lab aims to advance state-of-the-art research while integrating innovations into the company's products and services, including LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job:

  • Enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine-tuning toward continual, agentic self-improvement.

  • LLM post-training paradigms, such as RLHF, GRPO, and reward-free methods.

  • Agentic reinforcement learning for tool-using and browsing-based LLMs trained in interactive environments.

  • Agentic evaluation and benchmarking, including design of multi-turn, verifiable reasoning tasks.

  • Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning-enhanced LLMs and tool-using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.

Job requirements

About the ideal candidate:

  • PhD in Computer Science or a related field, or a master's degree with comparable experience.

  • Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.

  • Practical or research experience in reinforcement learning, self-supervised learning, or language model fine-tuning.

  • Proven research record in AI, with at least one first-author paper at a top-tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.

  • Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron, and other distributed training frameworks.

  • Familiarity with LLM post-training pipelines (RLHF, GRPO/PPO, SFT, LoRA, MoE, etc.) is a strong asset.

  • Experience with multi-agent RL or with tool-use, browser, or coding agents is a strong asset.

  • Strong communication and writing skills; enthusiasm for open research and collaborative problem-solving.
