Current jobs related to Lead Site Reliability Engineer - Old Toronto, Ontario - PagerDuty, Inc.


  • Old Toronto, Ontario, Canada Thomson Reuters Full time

    Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Thomson Reuters. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and efficiency of our cloud-based infrastructure. About the Role. In this position, you will be responsible for: Designing and implementing scalable...


  • Old Toronto, Ontario, Canada Thomson Reuters Full time

    Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Thomson Reuters. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud-based infrastructure. About the Role. In this role, you will be responsible for: Designing and implementing scalable systems and...


  • Old Toronto, Ontario, Canada Rogers Communications Full time

    Unlock Your Potential at Rogers Sports & Media. We're on the lookout for a talented Site Reliability Engineer to join our dynamic team at Rogers Sports & Media. As a key player in our organization, you'll have the opportunity to work on exciting projects and collaborate with a diverse group of professionals who share your passion for innovation and...


  • Old Toronto, Ontario, Canada Reperio Human Capital Full time

    Site Reliability Engineer. We are seeking an experienced Site Reliability Engineer to join our team at Reperio Human Capital. As a key member of our infrastructure team, you will be responsible for ensuring the reliability and scalability of our production systems. Key Responsibilities: Design and implement monitoring and automation solutions to ensure system...


  • Old Toronto, Ontario, Canada TD Bank Full time

    Job Title: AWS Site Reliability Engineer. TD Bank is seeking a highly skilled AWS Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based systems. Key Responsibilities: Design and implement scalable and reliable cloud-based systems using AWS...


  • Toronto, Ontario, Canada Thomson Reuters Full time

    About the Role. This is an exciting opportunity to join our team as a Lead Site Reliability Engineer at Thomson Reuters. As a key member of our engineering team, you will be responsible for leading and mentoring a team of SREs, providing technical guidance, coaching, and support to foster a culture of collaboration, innovation, and continuous improvement. Key...


  • Old Toronto, Ontario, Canada Thomson Reuters Full time

    About the Role. We are seeking a skilled Site Reliability Engineer to join our team at Thomson Reuters. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining scalable and reliable systems and services. Key Responsibilities: Design and implement scalable systems and services. Develop and maintain tools and scripts to...


  • Toronto, Ontario, Canada Thomson Reuters Full time

    About the Role. As a Lead Site Reliability Engineer at Thomson Reuters, you will play a key role in ensuring the reliability and scalability of our cloud-based infrastructure and applications. You will lead a team of SREs, providing technical guidance, coaching, and support to foster a culture of collaboration, innovation, and continuous improvement. Key...


  • Old Toronto, Ontario, Canada Mastech Inc. Full time

    Job Title: Site Reliability Engineer (GCP). Mastech Digital is a leading IT Staffing and Digital Transformation Services company, providing digital and mainstream technology staff to all American Corporations. We are currently seeking a Site Reliability Engineer (GCP) for our client in the Consulting domain. As a Site Reliability Engineer (GCP), you will play a...

Lead Site Reliability Engineer

2 months ago


Old Toronto, Ontario, Canada PagerDuty, Inc. Full time

PagerDuty empowers diverse teams to drive essential operations that propel business growth through the PagerDuty Operations Cloud.

We are in search of a Senior Site Reliability Engineer to become a vital member of our SRE-Platform team. In this capacity, you will play a crucial role in developing, sustaining, and enhancing the Kubernetes infrastructure that underpins PagerDuty. Our focus is on crafting solutions that boost developer efficiency, enhance reliability, and support PagerDuty's growth for the future. If you have a passion for platform engineering, enhancing developer experiences, and mastering Kubernetes, we would be eager to connect with you.

Key Responsibilities

  • Maintain the overall health of the platform by diagnosing and resolving production issues, monitoring system capacity, and collaborating with other technical teams to uphold compliance and security standards.
  • Collaborate with Engineering stakeholders to design and implement a reliable, scalable, secure, and high-performance platform.
  • Continuously seek to enhance the developer experience through full lifecycle support (creation, development, deployment, retirement), observability, flexible connectivity, and monitoring.
  • Share your knowledge and expertise across the entire Engineering organization.
  • Participate in a 24/7 on-call rotation, utilizing PagerDuty to manage on-call schedules.

Basic Qualifications

  • 5+ years of experience in Platform Engineering, Site Reliability Engineering, or DevOps roles.
  • Experience managing multiple Kubernetes clusters in a production setting.
  • Familiarity with cloud-native infrastructure (e.g., AWS, GCP, Azure).
  • Experience deploying web applications on Kubernetes (Helm, ArgoCD).
  • Proficiency in infrastructure as code (e.g., Terraform or CloudFormation).
  • Knowledge of a dynamic programming language (e.g., Ruby or Python).

Preferred Qualifications

  • Experience with monitoring, observability, and logging platforms (e.g., DataDog, New Relic, SumoLogic, Splunk).
  • Familiarity with configuration management systems (e.g., Ansible, Chef, Puppet).
  • Experience in automating releases, continuous integration/delivery systems, and relevant tools (e.g., Jenkins, CircleCI, Travis CI, Buildkite).

The base salary range for this position is 152,000 USD. This role may also be eligible for bonus, commission, equity, and/or benefits.

PagerDuty is dedicated to fostering a diverse environment and is an equal opportunity employer. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, parental status, veteran status, or disability status.