Site Reliability Engineering Manager

4 weeks ago


Index Exchange, Toronto, Ontario, Canada (Full time)

About Index Exchange:

We are a leading technology company that has shaped the earliest forms of ad tech, and we're looking for a technical expert to help shape its future. Our customers have unique problems that can only be solved at internet scale, and that's where the technical skills of our team make a real difference.

Our exchange handles more than 450 billion requests every day, all running in our own global data centers. Every member of our technology team has an enormous amount of autonomy in building and managing the systems that support our growing scale. Through the transparency of our technology, dedication to innovation and integrity, and long-standing customer relationships, we lead through change.

What's it like to work at Index Exchange?

We have more than 600 Indexers around the globe dedicated to building a safe and transparent marketplace that provides a trusted experience for consumers.

Index Exchange is an exciting and fast-paced place to work. We're built on our values of change, support, learning and teaching, trust, and intention. We pride ourselves on our independence and openness, not only in our technology, but in our teams, too. Our diverse and inclusive culture celebrates how we can leverage our unique differences to help drive Index Exchange forward.

About The Role

We are seeking an experienced Engineering Manager with a strong background in Site Reliability Engineering (SRE) to lead and develop a high-performance team of engineers. The ideal candidate will have a deep technical understanding of on-premise and hybrid cloud environments and a proven track record of managing SRE teams in a global setting.

Key Responsibilities:

  • Build and lead a world-class SRE team, fostering a culture of innovation, collaboration, and accountability.
  • Drive operational excellence through proactive monitoring, automation, and the development of robust incident management processes.
  • Collaborate with software engineering teams to implement SRE best practices in the software development life cycle.
  • Lead incident response efforts, ensuring rapid resolution and post-incident analysis to prevent recurrence.
  • Develop and maintain meaningful performance metrics and reporting mechanisms to track the health and reliability of our systems.

Requirements:

  • Proven experience (6+ years) in SRE roles, with a focus on low-latency, global-scale environments built on upstream Kubernetes.
  • Strong software engineering skills, including proficiency in programming languages such as Golang, Python, and Perl.
  • Excellent understanding of on-premise and hybrid cloud architectures.
  • Exceptional leadership and team-building skills with a track record of developing high-performing teams.
  • Expertise in incident management, root cause analysis, and post-incident reviews.

Why You'll Love Working Here:

  • Comprehensive health, dental, and vision plans at no cost to you.
  • Time off and flexible work schedules.
  • Retirement plan with a 5% company match.
  • Stock options and equity packages.
  • Generous parental leave.
  • Monthly wellness stipend plus fitness discounts and quarterly wellness group activities.
  • Home office stipend.
  • Community engagement opportunities and donation-matching program.
  • Annual virtual company retreats and regular community-led team events.

Equal Employment Opportunity:

At Index Exchange, we believe that successful products are built by teams just as diverse as the audiences that use them. As such, we are committed to equal employment opportunities. We celebrate diversity of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or expression, and veteran status.


