Senior Site Reliability Engineer

1 day ago


Toronto, Ontario, Canada RBC Full time

Job Description

What is the opportunity?

Join our Commercial, Core Banking and Payments Technology (CCBPT) team as a Senior Site Reliability Engineer, where you'll play a key role in supporting our cloud and distributed environments for the Personal Commercial Credit SRE & Ops team. This exciting opportunity will challenge you to work with cutting-edge technologies, including AI and emerging innovations, and collaborate closely with development teams to deliver embedded SRE solutions. As a vital link between QE, DevOps, Development, Infrastructure, and Support teams, you'll leverage your strong technical skills to solve complex problems and drive success across multiple components and technologies. If you're passionate about tackling new challenges and developing innovative solutions, we invite you to join our team and take your career to the next level.

What will you do?

  • Manage a small team of SREs 
  • Automate, automate and automate: identify, design, write and test automation procedures using AI, Ansible and other relevant technologies
  • Support applications running on many platforms including OpenShift and distributed systems
  • Design and implement Chaos Engineering experiments and Disaster Recovery procedures to test and validate system resilience and reliability
  • Establish and monitor SLOs and supporting SLIs for various applications
  • Develop and establish observability strategies for applications
  • Build and implement monitoring and alerting, anomaly detection, self-healing and reliability testing for applications in scope
  • Provide leadership and technical support for OAs, developers and DevOps engineers
  • Support incident management and problem management for applications in scope, and own and fulfill RCA action items
  • Be an escalation point in the on-call rotation, and support our maintenance, scheduled work, support, and release deployment requirements

What do you need to succeed?

Must-have

  • 5+ years of experience as a Site Reliability Engineer
  • A Bachelor's degree in Computer Science or related technical field (Example: Mathematics/Engineering/Physics), or equivalent practical experience
  • Strong working knowledge of Kubernetes and Cloud, with experience in and understanding of CI/CD pipelines and DevOps / Agile Methodology
  • Advanced knowledge of the following SRE practices and technologies: Python, YAML, Shell scripting, OpenShift, Linux, MongoDB, Dynatrace, Prometheus, PagerDuty, Moog, Splunk, Elastic, Ansible, Grafana, Chaos Engineering, MQ, Kafka
  • Ability to perform a production support role, including off-hours support
  • Effective negotiation and stakeholder management skills
  • Excellent communication skills

Nice-to-have

  • Strong knowledge in AI and building AI-based solutions
  • Some experience managing people 
  • Coding experience in Java/Python and/or scripting with Shell, Bash, Groovy or JavaScript
  • Knowledge of process automation and orchestration platforms like Camunda
  • Knowledge of deploying and supporting distributed applications
  • In-depth hands-on experience in a variety of SRE tools (Ansible, Catchpoint)
  • Experience working as an SRE within the Financial Industry

What's in it for you?

We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.

  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable
  • Leaders who support your development through coaching and managing opportunities
  • Work in a dynamic, collaborative, progressive, and high-performing team
  • Opportunities to do challenging work in AI and emerging technologies
  • Opportunities to take on progressively greater accountabilities
  • Access to a variety of job opportunities across business and geographies

Job Skills

Agile Methodology, AI Programming, Chaos Engineering, Cloud Technology, Distributed Environments, Emerging Technologies, Kubernetes, Payment Handling, People Management, Python (Programming Language), Red Hat Ansible, SRE Observability

Additional Job Details

Address:

RBC WATERPARK PLACE, 88 QUEENS QUAY W:TORONTO

City:

Toronto

Country:

Canada

Work hours/week:

Employment Type:

Full time

Platform:

TECHNOLOGY AND OPERATIONS

Job Type:

Regular

Pay Type:

Salaried

Posted Date:

Application Deadline:

Note: Applications will be accepted until 11:59 PM on the day prior to the application deadline date above

Inclusion and Equal Opportunity Employment

At RBC, we believe an inclusive workplace that has diverse perspectives is core to our continued growth as one of the largest and most successful banks in the world. Maintaining a workplace where our employees feel supported to perform at their best, effectively collaborate, drive innovation, and grow professionally helps to bring our Purpose to life and create value for our clients and communities. RBC strives to deliver this through policies and programs intended to foster a workplace based on respect, belonging and opportunity for all.

Join our Talent Community

Stay in-the-know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips and Recruitment events that matter to you.

Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities.


