Senior Site Reliability Engineering Specialist

7 days ago


London, Canada SAP SE Full time

We help the world run better

At SAP, we keep it simple: you bring your best to us, and we'll bring out the best in you. We're builders touching over 20 industries and 80% of global commerce, and we need your unique talents to help shape what's next. The work is challenging – but it matters. You'll find a place where you can be yourself, prioritize your wellbeing, and truly belong. What's in it for you? Constant learning, skill growth, great benefits, and a team that wants you to grow and succeed.

This is a hybrid role based out of Waterloo. Hybrid is 3 days a week onsite and 2 days a week remote.

As a Senior Site Reliability Engineer in Supply Chain Management (SCM) – Make & Deliver, you will ensure that SAP Digital Manufacturing and SAP Logistics Management operate reliably and efficiently at scale. These solutions support critical manufacturing and logistics processes worldwide, built on SAP BTP, Kubernetes, and multicloud environments. In this role, you act as an Enablement Advocate within the organization: partnering with development teams to review architecture for resiliency, enforce reliability guardrails, and integrate observability and performance standards into the design process. Beyond operational excellence, you will also help develop and integrate AIOps tools for smarter monitoring and automated remediation, ensuring reliability is embedded across the lifecycle. You’ll contribute to incident response for high severity events and drive automation that reduces complexity, enabling teams to deliver services that meet reliability goals by default.

What You’ll Do

- Define and maintain SLIs/SLOs for critical services; apply error budgets to guide release decisions.

- Collaborate with development teams to embed resiliency patterns and reliability guardrails into architecture and code.

- Contribute to incident response for high severity events; support root cause analysis and post-incident improvements.

- Establish and evolve observability standards (logging, metrics, tracing) and build actionable dashboards and alerts.

- Drive performance and scalability improvements through load testing, capacity planning, and CI/CD performance gates.

- Automate operational tasks using Infrastructure-as-Code (Terraform/Helm), pipelines, and scripts to reduce toil.

- Advance AIOps capabilities for anomaly detection, smarter alerting, and faster remediation.

- Partner across teams to provide guidance, reviews, and golden paths for reliability by default.

Tech You’ll Use (Day to Day)

- Automation & Development: CI/CD pipelines (GitHub Actions / Azure DevOps), Infrastructure as Code (Terraform/Helm), scripting, and integration into dev workflows.

- Observability: Logging, metrics, tracing tools; Dynatrace, Kibana/Elastic, Prometheus, OpenTelemetry.

- Data & Messaging: Confluent Kafka, SAP HANA

- Performance Testing: Load and stress testing tools (e.g., JMeter, k6).

- Languages: TypeScript, Python, Bash, Java.

What You’ll Bring

- 6-10+ years in SRE, DevOps, or production operations for distributed systems.

- Proven experience with incident response and root cause analysis for high severity events.

- Strong skills in observability, performance engineering, and automation.

- Hands‑on expertise in Kubernetes cluster management and troubleshooting.

- Ability to model load, run stress tests, analyze bottlenecks, and plan capacity.

- Proficiency in CI/CD and Infrastructure as Code, with ability to influence development practices.

- Excellent collaboration and communication skills to partner with development and product teams.

Nice to Have

- Familiarity with AIOps concepts (AI‑driven anomaly detection, predictive alerting, automated remediation).

- Hands‑on experience with LLM agent frameworks (e.g., LangGraph or similar) for automation or reliability tooling.

- Certifications in Kubernetes, SAP BTP, or Dynatrace.

- Experience with the manufacturing domain.

Education & Work Style

- Bachelor’s degree in Computer Science, Engineering, or equivalent experience.

- Curious, proactive, and data‑driven; comfortable mentoring and promoting best practices.

- Travel: Occasional (0–10%) for team workshops or cross‑site collaboration.

- On‑call: Participation in a healthy rotation with continuous improvement focus.

Bring out your best

SAP innovations help more than four hundred thousand customers worldwide work together more efficiently and use business insight more effectively. Originally known for leadership in enterprise resource planning (ERP) software, SAP has evolved to become a market leader in end-to-end business application software and related services for database, analytics, intelligent technologies, and experience management. As a cloud company with two hundred million users and more than one hundred thousand employees worldwide, we are purpose‑driven and future‑focused, with a highly collaborative team ethic and commitment to personal development. Whether connecting global industries, people, or platforms, we help ensure every challenge gets the solution it deserves. At SAP, you can bring out your best.

We win with inclusion

SAP’s culture of inclusion, focus on health and well-being, and flexible working models help ensure that everyone – regardless of background – feels included and can run at their best. At SAP, we believe we are made stronger by the unique capabilities and qualities that each person brings to our company, and we invest in our employees to inspire confidence and help everyone realize their full potential. We ultimately believe in unleashing all talent and creating a better world.

SAP is committed to the values of Equal Employment Opportunity and provides accessibility accommodations to applicants with physical and/or mental disabilities. If you are interested in applying for employment with SAP and are in need of accommodation or special assistance to navigate our website or to complete your application, please send an e‑mail with your request to Recruiting Operations Team: Careers@sap.com.

For SAP employees: Only permanent roles are eligible for the SAP Employee Referral Program, according to the eligibility rules set in the SAP Referral Policy. Specific conditions may apply for roles in Vocational Training.

Qualified applicants will receive consideration for employment without regard to their age, race, religion, national origin, ethnicity, gender (including pregnancy, childbirth, et al), sexual orientation, gender identity or expression, protected veteran status, or disability, in compliance with applicable federal, state, and local legal requirements.

SAP believes the value of pay transparency contributes towards an honest and supportive culture and is a significant step towards demonstrating SAP’s commitment to pay equity. SAP provides the annualized compensation range inclusive of base salary and variable incentive target for the career level applicable to the posted role. The targeted combined range for this position is 102,400 - 214,300 CAD. The actual amount to be offered to the successful candidate will be within that range, dependent upon the key aspects of each case, which may include education, skills, experience, scope of the role, location, etc., as determined through the selection process. Any SAP variable incentive includes a targeted dollar amount, and any actual payout amount is dependent on company and personal performance. A summary of benefits and eligibility requirements can be found at www.SAPNorthAmericaBenefits.com.

Due to the nature of the role, which involves global interactions with SAP entities, as well as with employees and stakeholders in Canada, functional proficiency in English is required for positions based in Quebec.

Please note that any violation of these guidelines may result in disqualification from the hiring process.

Requisition ID: 434468 | Work Area: Software-Design and Development | Expected Travel: 0 - 10% | Career Status: Professional | Employment Type: Regular Full Time | Additional Locations: #LI-Hybrid

Job Segment: Logistics, Embedded, Cloud, Testing, Supply Chain Manager, Operations, Technology
