Senior Site Reliability Engineer

2 weeks ago


Toronto, Ontario, Canada RBC Full time $120,000 - $180,000 per year

Job Description

What is the Opportunity?

We are seeking an experienced and skilled Senior Site Reliability Engineer and Systems Specialist to join our team, responsible for ensuring the stability, reliability, and performance of our mission-critical application.

The ideal candidate will possess a strong technical background in Linux administration, scripting, automation, and database management. This is a critical role that requires a high degree of technical expertise, attention to detail, and excellent problem-solving skills.

What will you do?

  • Provide expert-level support and maintenance for our mission-critical application, ensuring high availability and performance

  • Collaborate with cross-functional teams to identify and resolve technical issues, and implement preventative measures to minimize downtime

  • Develop and maintain automation scripts using Python or shell scripting to streamline application maintenance and deployment tasks

  • Design and implement DevOps/SRE automation solutions to improve application reliability, scalability, and efficiency

  • Administer and troubleshoot Linux-based systems, including configuration, security, and performance optimization

  • Develop and maintain SQL scripts to support data analysis, reporting, and application functionality

  • Participate in on-call rotations to provide 24/7 support for critical application issues

  • Collaborate with development teams to ensure smooth deployment of new features and updates

  • Develop and maintain technical documentation to support application maintenance and troubleshooting

1. Reliability & Performance Engineering

  • Design, implement, and maintain scalable systems with high availability, reliability, and performance.

  • Define and monitor SLAs, SLOs, and SLIs; drive observability improvements.

  • Conduct capacity planning, performance tuning, and system optimization.

  • Develop and implement disaster recovery and business continuity strategies.

2. Automation & Infrastructure as Code

  • Develop and maintain Infrastructure as Code (IaC) using tools such as Copilot and RBC Assist.

  • Build automation for CI/CD pipelines to streamline software delivery and deployment.

  • Automate routine operational tasks to improve efficiency and reduce human error.

  • Create and maintain reliable deployment processes, including blue-green and canary releases.

3. Monitoring, Incident Response & Root Cause Analysis

  • Own on-call responsibilities and develop processes to reduce alert fatigue.

  • Lead incident response efforts, including communication and postmortem documentation.

  • Implement and enhance monitoring and alerting systems (e.g., Prometheus, Grafana, Datadog).

  • Champion blameless postmortems and drive systemic fixes to recurring issues.

4. Collaboration, Governance & Mentorship

  • Collaborate closely with development, security, and operations teams to embed reliability practices.

  • Drive SRE best practices across teams and influence architecture and design decisions.

  • Participate in internal audits and compliance activities related to infrastructure and availability.

  • Mentor junior SREs and contribute to internal knowledge-sharing and documentation.

What do you need to succeed?

Must-Have:

  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.

  • 3+ years' experience with system administration in Red Hat Linux OS or Apache, Solar

  • 1+ years' experience with Ansible

  • Strong experience with Python scripting

  • Experience providing production support

  • Experience with monitoring/SRE tools like Dynatrace, PagerDuty, ELK Stack

Nice to haves:

  • Knowledge of or experience with AI (agents, LLMs, etc.)

  • Knowledge of Managed File Transfer platforms.

  • Knowledge of DevOps tools like Git, Docker, Jenkins, and Kubernetes.

What's in it for you?

We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.

  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable

  • Leaders who support your development through coaching and managing opportunities

  • Ability to make a difference and lasting impact

  • Work in a dynamic, collaborative, progressive, and high-performing team

  • A world-class training program in financial services

  • Flexible work/life balance options

  • Opportunities to do challenging work

  • Opportunities to take on progressively greater accountabilities 

  • Opportunities to build close relationships with clients


Job Skills

Agile Methodology, Ansible Tower, Application Infrastructure, Application Production Support, Automation, DevOps, Generative AI, Generative Programming, Group Problem Solving, IT Automation, IT Monitoring, IT Production Support, Large Language Models (LLMs), Linux Server Administration, Linux Shell Scripting, Operations Support, PL/SQL (Programming Language), Production Support, Python (Programming Language), Red Hat Ansible, Red Hat OpenShift, Software Development Life Cycle (SDLC), Software Engineering, Software Product Technical Knowledge

Additional Job Details

Address:

180 WELLINGTON ST W:TORONTO

City:

Toronto

Country:

Canada

Work hours/week:

Employment Type:

Full time

Platform:

TECHNOLOGY AND OPERATIONS

Job Type:

Regular

Pay Type:

Salaried

Posted Date:

Application Deadline:

Note: Applications will be accepted until 11:59 PM on the day prior to the application deadline date above

Inclusion and Equal Opportunity Employment

At RBC, we believe an inclusive workplace that has diverse perspectives is core to our continued growth as one of the largest and most successful banks in the world. Maintaining a workplace where our employees feel supported to perform at their best, effectively collaborate, drive innovation, and grow professionally helps to bring our Purpose to life and create value for our clients and communities. RBC strives to deliver this through policies and programs intended to foster a workplace based on respect, belonging and opportunity for all.

Join our Talent Community

Stay in-the-know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips and Recruitment events that matter to you.

Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities.