Site Reliability Engineer

1 week ago


Montreal, Quebec, Canada SAP Full time

We help the world run better

Our company culture is focused on helping our employees enable innovation by building breakthroughs together. How? We focus every day on building the foundation for tomorrow and creating a workplace that embraces differences, values flexibility, and is aligned to our purpose-driven and future-focused work. We offer a highly collaborative, caring team environment with a strong focus on learning and development, recognition for your individual contributions, and a variety of benefit options for you to choose from.

PURPOSE AND OBJECTIVES

The Reliability Engineering organization provides a multitude of products and services related to operations and continuity of business delivery.


The Site Reliability Engineering teams make the SAP Business Technology Platform run better by applying SRE principles to provide 24x7 deep technical coverage for Incident Management (outages and other incidents with major customer impact). We share a Live Site First culture and care for the business continuity of our customers running mission-critical applications in the cloud.


We are looking for an engineer to join an already established SRE team for the SAP Business Technology Platform.

EXPECTATIONS AND TASKS

As a Site Reliability Engineer, you will have the opportunity to operate and support business-critical cloud services. As part of your daily job, you will proactively monitor service behavior and identify areas for improvement. You will participate in the development of tools for monitoring and troubleshooting cloud services built on the latest open-source and SAP technologies, following SRE principles.

Responsibilities

  • Act as a technical expert during live site incidents (downtimes of supported services in scope), investigating and solving incidents at a deep technical level.
  • Drive root cause analysis and follow-up improvements to prevent issues from recurring.
  • Perform in-depth troubleshooting and log analysis to identify and solve complex issues in accordance with internal and external SLAs.
  • Build software-based solutions to improve service reliability and stability.
  • Enhance infrastructure and platform monitoring by gathering system metrics (the four golden signals) and implementing tools for recovery.
  • Integrate and collaborate closely with development teams, working with them on outputs from postmortems and on product improvements.
  • Learn new technologies and keep up to date with the latest development increments.
  • Create and maintain technical documentation.
  • Define, advocate for, and apply SRE best practices.
  • Participate in the on-call rotation (follow-the-sun approach) to react to major incidents. On-call duty comes with a special compensation package.

If you are interested in software engineering based on cutting-edge technology, you will find an inspiring and professional environment for your learning and growth. You will work in close collaboration with the development teams that build the services for which we share responsibility. We emphasize teamwork and a trust-based working model. Collaboration with other teams in an international environment will be a regular part of your work.

EDUCATION AND QUALIFICATIONS / SKILLS AND COMPETENCIES

Required Skills and Competencies

  • BSc degree in Computer Science or a related technical field.
  • Experience with Kubernetes and a good understanding of container technologies.
  • Understanding of modern cloud architectures (experience with cloud platforms such as AWS, Azure, or GCP is a plus).
  • Scripting and CI/CD skills (Jenkins and ArgoCD are a plus) and enthusiasm for automation: make the computers do the work for you.
  • Ability to work efficiently in emergency situations and to quickly analyze and solve problems in a global team setup.
  • Excellent team player who is passionate about their work, self-motivated, and driven.
  • Excellent communication skills: precise and fact-based.
  • Fluency in English and basic French.

Preferred Additional Skills and Competencies

  • Coding experience with Go, Terraform, Python
  • CKA/CKAD/CKS certifications
  • Experience with Unix/Linux operating systems
  • Experience with modern monitoring, logging, and alerting tools (Grafana, Prometheus, Kibana, Loki, Splunk On-Call, Dynatrace)
  • Security best practices for application development and operations in a public Cloud Environment
  • Contribution to open-source projects

WORK EXPERIENCE

If you are interested in this position and would like to join our team, please apply even if you don't meet all the qualifications listed in the job posting. You may be offered a position according to your current working experience and expertise.

We build breakthroughs together

SAP innovations help more than 400,000 customers worldwide work together more efficiently and use business insight more effectively. Originally known for leadership in enterprise resource planning (ERP) software, SAP has evolved to become a market leader in end-to-end business application software and related services for database, analytics, intelligent technologies, and experience management. As a cloud company with 200 million users and more than 100,000 employees worldwide, we are purpose-driven and future-focused, with a highly collaborative team ethic and commitment to personal development. Whether connecting global industries, people, or platforms, we help ensure every challenge gets the solution it deserves. At SAP, we build breakthroughs, together.

We win with inclusion

SAP's culture of inclusion, focus on health and well-being, and flexible working models help ensure that everyone – regardless of background – feels included and can run at their best. At SAP, we believe we are made stronger by the unique capabilities and qualities that each person brings to our company, and we invest in our employees to inspire confidence and help everyone realize their full potential. We ultimately believe in unleashing all talent and creating a better and more equitable world.
SAP is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to the values of Equal Employment Opportunity and provide accessibility accommodations to applicants with physical and/or mental disabilities. If you are interested in applying for employment with SAP and are in need of accommodation or special assistance to navigate our website or to complete your application, please send an e-mail with your request to Recruiting Operations Team:

For SAP employees: Only permanent roles are eligible for the SAP Employee Referral Program, according to the eligibility rules set in the SAP Referral Policy. Specific conditions may apply for roles in Vocational Training.

EOE AA M/F/Vet/Disability:

Qualified applicants will receive consideration for employment without regard to their age, race, religion, national origin, ethnicity, gender (including pregnancy, childbirth, et al.), sexual orientation, gender identity or expression, protected veteran status, or disability.

SAP believes the value of pay transparency contributes towards an honest and supportive culture and is a significant step toward demonstrating SAP's commitment to pay equity. SAP provides the annualized compensation range inclusive of base salary and variable incentive target for the career level applicable to the posted role. The targeted combined range for this position is 71, ,800 Canadian CAD. The actual amount to be offered to the successful candidate will be within that range, dependent upon the key aspects of each case which may include education, skills, experience, scope of the role, location, etc. as determined through the selection process. Any SAP variable incentive includes a targeted dollar amount, and any actual payout amount is dependent on company and personal performance. Please reference this link for a summary of SAP benefits and eligibility requirements: ​

Requisition ID: | Work Area: Software-Development Operations | Expected Travel: 0 - 10% | Career Status: Professional | Employment Type: Regular Full Time | Additional Locations: #LI-Hybrid


