Senior Site Reliability Engineer

3 months ago


Old Toronto, Canada Manulife Insurance Malaysia Full time

Job Description
Do you want to be part of a team that redefines how we get work done? We are changing the way we develop, and we want you to be part of it. We are seeking a self-motivated Senior Site Reliability Engineer in our Identity and Access Management space who is obsessed with delivering value, is forward-thinking, and is excited to see the successful implementation of the products delivered.

As the Senior Site Reliability Engineer, you will:

  • Provide hands-on SRE leadership and run the production environment by monitoring availability and taking a comprehensive view of system health

  • Build software and systems to manage platform infrastructure and applications

  • Drive transformation by automating existing processes and improving the reliability, quality, and time-to-market of our suite of software solutions

  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement

  • Provide primary operational support and engineering for multiple large-scale distributed software applications

  • Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding

  • Lead technology currency and drive innovation by introducing automation (server patching, certificate management, compliance, etc.)

  • Create sustainable systems and services through automation and uplifts, and set the vision for the SRE practice, including but not limited to monitoring, alerting, self-healing, reliability testing, and chaos engineering

  • Partner with development teams to improve services through rigorous testing and release procedures

  • Participate in system design consulting, platform management, and capacity planning

  • Balance feature development speed and reliability with well-defined service-level objectives

You will bring and continuously build upon the following skills:

  • 5+ years of SRE experience working in or leading complex enterprise implementations

  • An entrepreneurial spirit and comfort working within a rapidly changing startup environment; you love the challenge of working on a small team and being part of a larger movement to change the engineering culture of an enterprise

  • Advanced knowledge of the following SRE practices:

    • Shell scripting

    • GitOps, Jenkins, Terraform

    • Azure, Azure Kubernetes, Linux, Docker

    • Dynatrace, New Relic, Prometheus, Grafana, Azure Monitor

    • Chaos Engineering

    • Strong Kubernetes and AKS exposure

    • Azure Automation (Nice to have)

  • Familiarity with agile and DevOps principles, test-driven development, continuous integration, and other Software Engineering best practices to accelerate the delivery and quality of new features

  • Eagerness to learn emerging technologies and to understand how they will impact what comes next

  • A capacity for constant learning from both success and failure, remaining open to change and continuous improvement

  • Excellent organizational and problem-solving abilities that enable you to manage through the creative process

  • Strong verbal and written communication with the ability to effectively articulate and communicate technical vision, possibilities, and outcomes to engineering leadership


  • Old Toronto, Ontario, Canada Northbridge Financial Corporation Full time

    Senior Site Reliability Engineer. About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Northbridge Financial Corporation. As a key member of our engineering team, you will be responsible for designing, developing, and implementing site reliability solutions that align with our business goals. Key...


  • Old Toronto, Canada Northbridge Financial Corporation Full time

    What is it like to be a Senior Site Reliability Engineer at Northbridge Financial? The Senior Site Reliability Engineer oversees the creation and implementation of Service Level Objectives (SLOs). The Senior SRE handles service reliability solutions and processes of increasing complexity and is responsible for mentoring and leading less experienced SREs. We...


  • Old Toronto, Ontario, Canada Manulife Insurance Malaysia Full time

    Senior Site Reliability Engineer. About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our Identity and Access Management team at Manulife Insurance Malaysia. As a key member of our engineering team, you will play a crucial role in ensuring the reliability, scalability, and performance of our software solutions. Key...


  • Old Toronto, Ontario, Canada Manulife Insurance Malaysia Full time

    Senior Site Reliability Engineer. About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our Identity and Access Management team. As a key member of our engineering team, you will be responsible for ensuring the reliability, scalability, and performance of our software solutions. Key Responsibilities: Lead the development and...


  • Old Toronto, Canada GlossGenius Full time

    About GlossGenius: GlossGenius is building an ecosystem enabling entrepreneurs to succeed. We empower small business owners to focus on being creators, not admins, by offering a range of business management tools including booking and scheduling, marketing, analytics, payment processing, and much more. Over 75,000 small business owners have chosen to rely on...


  • Old Toronto, Ontario, Canada Etraveli Group Full time

    About Etraveli Group: We are a dynamic and growing company in the travel tech industry, revolutionizing the way people travel. Our innovative virtual interlining technology provides access to billions of travel itineraries by combining flights from different airline carriers. We strive to deliver exceptional customer experiences while providing higher margin...


  • Old Toronto, Ontario, Canada Thomson Reuters Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Thomson Reuters. As a Senior Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our applications and infrastructure. You will work closely with cross-functional teams to identify and resolve issues,...


  • Old Toronto, Ontario, Canada PagerDuty Full time

    About the Role: PagerDuty is seeking a highly skilled Senior Site Reliability Engineer to join our SRE-Platform team. As a key contributor, you will be responsible for building, maintaining, and scaling the Kubernetes platform that powers PagerDuty. Key Responsibilities: Triage and troubleshoot production issues, monitor system capacity, and ensure adherence to...


  • Old Toronto, Canada Jobber Full time

    At Jobber, we don’t just build a product - we work on real problems that help people in small businesses to become successful. We are inspired by our company values: be humble, be supportive and give a shit, which is not just said but is lived. We work in a collaborative environment where teams make decisions with autonomy and contribute directly to...

