Site Reliability Engineer

1 week ago


Montreal, Quebec, Canada SAP SE Full time
About the Role

We are seeking a highly skilled Site Reliability Engineer to join our team at SAP SE. As a key member of our Cloud Infrastructure team, you will play a critical role in ensuring the reliability and performance of our cloud-based services.

Key Responsibilities
  • Act as a technical expert during live site incidents, investigating and resolving issues at a deep technical level.
  • Drive root cause analysis and implement improvements to prevent issues from recurring.
  • Perform in-depth troubleshooting and log analysis to identify and resolve complex issues in accordance with internal and external SLAs.
  • Design and implement software-based solutions to enhance service reliability and stability.
  • Enhance infrastructure and platform monitoring by gathering system metrics and implementing tools for recovery.
  • Collaborate closely with development teams to integrate and implement outputs from postmortems and product improvements.
  • Stay up-to-date with the latest development increments and technologies.
  • Develop and maintain technical documentation.
  • Advocate and apply SRE best practices.
  • Participate in the on-call rotation to respond to major incidents.
Requirements
  • Experience with Kubernetes and a good understanding of container technologies.
  • Understanding of modern cloud architectures and experience with cloud platforms such as AWS, Azure, or GCP.
  • Scripting skills, experience with CI/CD tools (Concourse, GitHub Actions, or ArgoCD), and enthusiasm for automation.
  • Ability to work efficiently in emergency situations and analyze problems in a global team setup.
  • Excellent team player, passionate about work, self-motivated, and driven.
  • Excellent communication skills: precise and fact-based.
  • Fluency in English and basic French.
Preferred Qualifications
  • Coding experience with Python, Bash, or Go.
  • CKA/CKAD/CKS certifications.
  • Experience with Unix/Linux operating systems.
  • Experience with modern monitoring, logging, and alerting tools (Grafana, Prometheus, Kibana, Loki, Splunk On-Call, or Dynatrace).
  • Knowledge of security best practices for application development and operations in a public cloud environment.
  • Contribution to open-source projects.
About SAP SE

SAP SE is a global leader in enterprise software and software-related services. Our innovations help more than four hundred thousand customers worldwide work together more efficiently and use business insight more effectively. We are a cloud company with two hundred million users and more than one hundred thousand employees worldwide, driven by a purpose to help the world run better and improve people's lives.

Our Culture

SAP's culture of inclusion, focus on health and well-being, and flexible working models help ensure that everyone – regardless of background – feels included and can run at their best. We believe we are made stronger by the unique capabilities and qualities that each person brings to our company, and we invest in our employees to inspire confidence and help everyone realize their full potential.



  • Montreal, Quebec, Canada Axelon Services Corporation Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a skilled Site Reliability Engineer to join our Application Infrastructure team. As a Site Reliability Engineer, you will play a critical role in driving the reliability engineering, operations, and customer support services for our ServiceNow SaaS implementation. Key Responsibilities: Deliver...


  • Montreal, Quebec, Canada Lyft Full time

    Lyft is a leading micromobility company dedicated to improving urban transportation systems worldwide. They are seeking a Site Reliability Engineer to join their expanding team. This individual will be responsible for designing, implementing, and maintaining the infrastructure systems to ensure reliability and scalability. Responsibilities: Assist in defining...


  • Montreal, Quebec, Canada Axelon Services Corporation Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a skilled Site Reliability Engineer to join our HashiVault squad. As an SRE, you will be responsible for implementing new features, dealing with user requests, and reducing repeatable tasks to allow more time for strategic initiatives. About the Company: *** is a leading global...


  • Montreal, Quebec, Canada Lyft Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Lyft in Montreal. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our production systems, platforms, and tools. Key Responsibilities: Define the team's roadmap and architecture based on technological and...


  • Montreal, Quebec, Canada SAP Full time

    About the Role: We are seeking a skilled Site Reliability Engineer to join our team at SAP. As a key member of our Reliability Engineering organization, you will play a critical role in ensuring the smooth operation of our business-critical Cloud services. Key Responsibilities: Act as a technical expert during live site incidents, investigating and resolving...


  • Montreal, Quebec, Canada NBC Full time

    Job Title: Site Reliability Specialist. We are seeking a highly skilled Site Reliability Specialist to join our team at NBC. As a key member of our DevOps squad, you will play a critical role in innovating, developing, and maintaining the availability and reliability of our assets in production. Key Responsibilities: Be the primary point of contact between...


  • Montreal, Quebec, Canada NBC Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at NBC. As a key member of our DevOps squad, you will play a critical role in innovating and developing new ways to integrate data into our assets, ensuring their availability and reliability in production. Key Responsibilities: Be the primary point of contact between...


  • Montreal, Quebec, Canada Lightspeed Full time

    We are seeking a Senior Site Reliability Engineer to become a vital part of our team at Lightspeed. Our organization specializes in developing innovative software solutions that empower merchants to enhance their business growth and profitability. In this role, you will collaborate with a team focused on addressing critical areas such as cloud...


  • Montreal, Quebec, Canada Axelon Services Corporation Full time

    Job Title: Site Reliability Engineer (SRE), ServiceNow, Application Infrastructure. At Axelon Services Corporation, we are seeking a highly skilled Site Reliability Engineer (SRE) to join our Application Infrastructure team. As an SRE, you will play a critical role in driving the reliability engineering, operations, and customer support services for our...

