Lead Engineer, Generative AI and Machine Learning
17 hours ago
Job Description
What is the opportunity?
This role offers a unique chance to pioneer the integration of Generative AI and machine learning (ML) into Site Reliability Engineering (SRE), driving transformative improvements in system reliability, efficiency, and scalability. You will work at the intersection of AI/ML innovation and cloud-native infrastructure, addressing critical challenges like anomaly detection, incident prediction, and automation. By leveraging cutting-edge technologies, you will empower organizations to minimize downtime, enhance observability, and optimize operational workflows, directly impacting business continuity and performance.
What will you do?
- Design and deploy end-to-end AI/ML solutions to solve SRE challenges (e.g., log analysis, auto-remediation, and predictive maintenance).
- Develop models using supervised/unsupervised learning and Generative AI tools (e.g., LLMs, text-generation frameworks) to improve system resilience.
- Fine-tune models, engineer prompts, and integrate AI solutions with SRE tooling (monitoring systems, CI/CD pipelines).
- Collaborate with SRE, DevOps, and data science teams to scale solutions across cloud platforms (OCP, Azure).
- Translate AI insights into strategies for reducing downtime, automating tasks, and aligning with SRE principles (SLOs, error budgets).
- Build and maintain ML pipelines using Python, TensorFlow, PyTorch, and OpenAI APIs.
- Evaluate emerging AI technologies to advance reliability engineering practices.
What do you need to succeed?
- Technical Expertise: Strong experience in ML/Generative AI, Python, and frameworks like TensorFlow, PyTorch, or OpenAI APIs.
- SRE Knowledge: Familiarity with SRE concepts (SLOs, error budgets) and cloud-native environments (OCP, Azure).
- Problem-Solving Skills: Ability to address complex reliability challenges with AI-driven solutions.
- Collaboration: Effective teamwork with cross-functional teams (SRE, DevOps, data science).
- Innovation: Passion for exploring emerging AI technologies and advocating for novel approaches.
- Operational Focus: Commitment to ensuring scalable, production-ready deployments and optimizing model performance.
Must haves:
- Proven expertise in machine learning (ML) and Generative AI: Hands-on experience with frameworks like TensorFlow, PyTorch, or Hugging Face, and tools such as OpenAI APIs or LLMs.
- Strong programming skills in Python: Proficiency in developing and deploying ML models and pipelines.
- SRE/DevOps fundamentals: Familiarity with Site Reliability Engineering principles (e.g., SLOs, error budgets) and cloud-native infrastructure (OCP, Azure).
- Model deployment and scalability: Experience operationalizing ML models in production environments, including monitoring, maintenance, and optimization.
- Collaborative problem-solving: Ability to work with cross-functional teams (SRE, DevOps, data science) to translate technical insights into actionable solutions.
- Data analysis and engineering: Skills in preprocessing data, feature engineering, and working with large-scale datasets.
Nice to haves:
- Prompt engineering and fine-tuning: Experience optimizing Generative AI models (e.g., LLMs) for domain-specific tasks.
- MLOps/AIOps tools: Familiarity with ML pipeline orchestration (e.g., Kubeflow, MLflow) and SRE tooling (e.g., Prometheus, Kubernetes).
- Anomaly detection/time-series analysis: Prior work in predictive maintenance, incident forecasting, or log analysis for infrastructure systems.
- Open-source contributions: Active participation in AI/ML or SRE-related open-source projects.
- Cloud certifications: Advanced credentials (e.g., AWS Certified Machine Learning - Specialty, Google Cloud Professional Machine Learning Engineer).
- Domain knowledge in observability: Experience with tools like Grafana, ELK Stack, or Splunk for enhancing system visibility.
What's in it for you?
We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.
- A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable
- Leaders who support your development through coaching and managing opportunities
- Ability to make a difference and lasting impact
- Work in a dynamic, collaborative, progressive, and high-performing team
- Flexible work/life balance options
- Opportunities to do challenging work
- Opportunities to take on progressively greater accountabilities
- Access to a variety of job opportunities across business and geographies
Job Skills
Agile Methodology, Group Problem Solving, IT Systems Integration, Organizational Leadership, Product Services, Software Development Life Cycle (SDLC), System Applications, System Integration Testing (SIT), Systems Software
Additional Job Details
Address: RBC WATERPARK PLACE, 88 QUEENS QUAY W:TORONTO
City: Toronto
Country: Canada
Work hours/week:
Employment Type: Full time
Platform: TECHNOLOGY AND OPERATIONS
Job Type: Regular
Pay Type: Salaried
Posted Date:
Application Deadline:
Note: Applications will be accepted until 11:59 PM on the day prior to the application deadline date above
Inclusion and Equal Opportunity Employment
At RBC, we believe an inclusive workplace that has diverse perspectives is core to our continued growth as one of the largest and most successful banks in the world. Maintaining a workplace where our employees feel supported to perform at their best, effectively collaborate, drive innovation, and grow professionally helps to bring our Purpose to life and create value for our clients and communities. RBC strives to deliver this through policies and programs intended to foster a workplace based on respect, belonging and opportunity for all.
Join our Talent Community
Stay in-the-know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips and Recruitment events that matter to you.
Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities.