Senior Site Reliability Engineer

2 weeks ago


Mississauga, Canada RBC - Royal Bank Full time

Job Summary

Job Description

What is the opportunity?

RBC Insurance Technology is seeking a Senior Site Reliability Engineer for its Insurance Technology Platform Support team. The Insurance Technology Platform Support team is a specialized unit dedicated to ensuring the optimal performance, availability, and resilience of IT applications used in the insurance line of business. With a unique blend of technical expertise and industry-specific knowledge, this team plays a critical role in ensuring the seamless operation of digital services that cater to the business's internal and external stakeholders.

As a Senior Site Reliability Engineer, you will bring the engineering mindset of bold ambition, curiosity and outcome focus to ensuring the performance and reliability of our systems. This role calls for a dynamic individual who excels in a collaborative environment, interacting with cross-functional teams to establish best practices for observability, monitoring, logging, alerting, and automation. This role will be responsible for the development, implementation, and support of Site Reliability Engineering (SRE) solutions for applications supported by RBC Insurance Technology. You'll leverage your proficiency in Elasticsearch, Ansible, GitHub Actions, Moogsoft, PagerDuty, Dynatrace and scripting languages to build and maintain robust automation and SRE tooling.

What will you do?

  • Set vision for SRE product base (monitoring, alerting, machine learning anomaly detection, self-healing, reliability testing)

  • Lead cross-functional collaborations to define and implement best practices for monitoring, logging, and incident response, driving a proactive stance on system health.

  • Implement and manage automation processes with Ansible and GitHub Actions to streamline operational tasks.

  • Develop, maintain, and refine custom tooling and automation scripts in languages such as Bash, Python, and PowerShell to enhance operational efficiency and support the infrastructure's scalability and reliability.

  • Work closely with development teams to understand code changes and their impact on the production environment, ensuring that new releases meet our reliability standards.

  • Actively contribute to the definition and tracking of SLIs, SLOs, and other critical metrics, refining our alerting and monitoring strategies accordingly.

  • Document and maintain comprehensive runbooks, facilitating quick resolution of incidents and reducing mean time to recovery (MTTR).

  • Guide the technical direction for future deployments, advocating for reliability and performance improvements based on industry trends and company objectives.

  • Mentor team members in building out robust monitoring and alerting strategies based on well-defined SLIs and SLOs.

  • Act as the portfolio Subject Matter Expert (SME): understand and document the common components, core functionality, and infrastructure of supported applications.

  • Lead incident management and problem management for in-scope applications, and own the fulfillment of RCA action items.

  • Drive transformation by continuously looking for ways to automate existing processes.

  • Debug production issues across services and levels of the stack and provide primary operational support.

  • Provide production support, including off-hours support as part of an on-call rotation.

Must-have:

  • 4+ years of SRE or Systems Engineering experience with a proven record in technical leadership.

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.

  • Expertise in infrastructure-as-code and configuration management, particularly Ansible.

  • Advanced scripting capabilities in Bash, Python, PowerShell, or other similar languages.

  • In-depth knowledge of tools such as Elasticsearch, Ansible, GitHub, OpenShift, Kubernetes, Dynatrace, Kafka, and their role in system reliability.

  • Knowledge of creating, maintaining, and alerting on SLIs, SLOs, and other reliability metrics.

Nice-to-have:

  • Insurance industry experience

  • In-depth hands-on experience with a variety of SRE tools (Azure Automation, Catchpoint, Prometheus, Splunk, Grafana).

  • Familiarity with containerization technologies such as Docker.

  • Hands-on experience with DevOps CI/CD tools such as Jenkins, Artifactory, and Vault.

Soft Skills:

  • Excellent communication skills to foster collaboration across departments.

  • A resilient problem-solving approach, capable of leading the charge during high-stress incidents.

  • Strategic thinking and analytical prowess, with a focus on delivering reliable and performant systems.

  • Organizational skills to manage multiple priorities in a fast-paced environment.

RBC is committed to supporting flexible work arrangements when and where available. Details to be discussed with Hiring Manager.

What's in it for you?

We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.

  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable

  • Leaders who support your development through coaching and managing opportunities

  • Ability to make a difference and lasting impact

  • Work in a dynamic, collaborative, progressive, and high-performing team

  • A world-class training program in financial services

  • Flexible work/life balance options

  • Opportunities to do challenging work

Job Skills

Agile Methodology, Application Infrastructure, Group Problem Solving, IT Automation, IT Monitoring, Operations Support, Production Support, Software Development Life Cycle (SDLC), Software Engineering, Software Product Technical Knowledge, System Applications, Systems Software

Additional Job Details

Address:

MEADOWVALE BUSINESS PARK, 6880 FINANCIAL DR:MISSISSAUGA

City:

MISSISSAUGA

Country:

Canada

Work hours/week:

37.5

Employment Type:

Full time

Platform:

Technology and Operations

Job Type:

Regular

Pay Type:

Salaried

Posted Date:

2024-05-03

Application Deadline:

2024-05-17

Inclusion and Equal Opportunity Employment

At RBC, we embrace diversity and inclusion for innovation and growth. We are committed to building inclusive teams and an equitable workplace for our employees to bring their true selves to work. We are taking actions to tackle issues of inequity and systemic bias to support our diverse talent, clients and communities.

We also strive to provide an accessible candidate experience for our prospective employees with different abilities. Please let us know if you need any accommodations during the recruitment process.

Join our Talent Community

Stay in-the-know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips and Recruitment events that matter to you.

Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities at jobs.rbc.com.


