Systems Reliability Engineer

4 weeks ago


Toronto, Ontario, Canada Scotiabank Full time

As a key member of our team at Scotiabank, you will play a critical role in ensuring the reliability and performance of our production systems.

Key Responsibilities:
  • Contribute to in-depth data analysis to gauge service trends and drive improvements to production systems.
  • Collaborate closely with SRE, Development, and Operations teams to troubleshoot severe incidents: contribute to incident communication, drive problem-solving, and debug using best practices.
  • Support the Senior Manager and Director in proactively communicating reliability, stability, and efficiency results, service health, and key reliability risks and issues to senior business and technology stakeholders.
  • Support and handle production incident escalations during business and off-hours.
  • Participate in Root Cause Analysis for systems in scope to identify gaps and drive implementation of improvements.
  • Lead investigations for production issues and assist in developing solutions that follow sound software engineering and security principles defined by the organization.
  • Interface with Scotiabank infrastructure and engineering teams to improve the stability and resiliency of production applications.
  • Analyze metrics, identify trends, and execute processes and strategies to address any risks to reliability.
Requirements:
  • Minimum of 5 years of experience supporting technology in an operational role.
  • Minimum of 5 years of experience working with enterprise delivery methodologies such as ITIL, Agile, and Waterfall.
  • Expert knowledge of incident and problem management methodologies in a production environment, including the use of platforms such as ServiceNow to manage incidents and problems.
  • Ability to analyze and present data using business analysis and business intelligence tools and techniques to create executive dashboards and reports.
  • Experience with operational monitoring and performance management tools such as Dynatrace, Aternity, ARMS, Tivoli, and other similar technologies.
  • Experience with Splunk, eVIEW, or other software for searching, monitoring, and examining machine-generated Big Data.
  • Knowledge and understanding of Google's SRE best practices for Service Level Objectives (SLOs).
  • Ability to collaborate with other teams on the consultation, design, implementation, and management of monitoring tools, dashboards, and reports.
  • Excellent verbal and written communication skills, as well as strong problem-solving skills coupled with the ability to collaborate with development teams and business partners.
What We Offer:
  • Diversity, Equity, Inclusion & Allyship - We strive to create an inclusive culture where every employee is empowered to reach their fullest potential, respected for who they are, and embraced through bias-free practices and inclusive values across Scotiabank.
  • Accessibility and Workplace Accommodations - We value the unique skills and experiences each individual brings to the Bank and are committed to creating and maintaining an inclusive and accessible environment for everyone.
  • Upskilling through online courses, cross-functional development opportunities, and tuition assistance.
  • Competitive Rewards program including bonus, flexible vacation, personal and sick days, and benefits that start on day one.
  • Community Engagement - No matter where you choose to work from, we offer opportunities for community engagement & belonging through programs such as hackathons, contests, cooking with friends, Humans of Digital, and much more.

Work arrangements: Hybrid




  • Old Toronto, Ontario, Canada Chelsea Avondale Full time

    Job Title: Asset Reliability Engineer. At Chelsea Avondale, we're pushing the boundaries of home insurance innovation. Our team of experts has developed cutting-edge risk modeling and insurance pricing technologies, which we deploy through our own insurance company. We're a group of talented individuals from diverse backgrounds, including insurance, software...


  • Toronto, Ontario, Canada Lorven Technologies Full time

    Job Title: Reliability Systems Engineer. Location: Remote. Duration: Long term. A Bachelor's degree in Computer Science or related technical field, or equivalent practical experience. Advanced knowledge of SRE practices and technologies including Azure, Linux, and scripting languages. Expertise in various SRE tools such as Ansible, Azure Automation,...

  • Reliability Engineer

    4 weeks ago


    Toronto, Ontario, Canada Scotiabank Full time

    About the Role: We are seeking a highly skilled Systems Reliability Engineer to join our team at Scotiabank. As a key member of our Systems Reliability Office, you will be responsible for ensuring the stability and reliability of our technology portfolio. Key Responsibilities: Champion a customer-focused culture to deepen client relationships and leverage...


  • Toronto, Ontario, Canada Vantage Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Vantage. As a key member of our engineering team, you will play a pivotal role in ensuring the seamless operation of our large-scale, distributed systems. Your expertise in software and systems engineering will be instrumental in building, maintaining, and...


  • Toronto, Ontario, Canada Interac Corp. Full time

    Senior System Reliability Engineer. We are seeking a skilled Senior System Reliability Engineer to join our team at Interac Corp. in Canada. About the Role: This is an exciting opportunity to work on high-performance payment systems, focusing on Site (Application) Reliability Engineering activities, including proactive monitoring, responding to alerts and...


  • Toronto, Ontario, Canada Safran Landing Systems Full time

    Job Description Assist in the development and certification of the landing gear system, including hydro-mechanical, electrical, and control systems designed per software and complex hardware (DO-178/DO-254). Liaise with customers and airworthiness authorities on matters pertaining to certification and system development. Define requirements applicable to the...

  • Reliability Engineer

    4 weeks ago


    Toronto, Ontario, Canada Scotiabank Full time

    About the Role: We are seeking a highly skilled System Reliability Engineer to join our team at Scotiabank. As a key member of our Systems Reliability Office, you will play a critical role in ensuring the stability and reliability of our technology portfolio. Key Responsibilities: Champion a customer-focused culture to deepen client relationships and leverage...


  • Toronto, Ontario, Canada Scotiabank Full time

    About the Role: As a key member of the Systems Reliability Office, you will work collaboratively with various teams to deliver high-quality results. Your contributions will directly impact the success of our stakeholders. Key Responsibilities: Contribute to cross-functional teams to drive significant deliverables. Work closely with stakeholders to understand and...


  • Toronto, Ontario, Canada Safran Landing Systems Full time

    Job Description: As a Senior Systems Engineer at Safran Landing Systems, you will play a key role in the development and certification of the Landing Gear System. This includes working on hydro-mechanical, electrical, and control systems designed per Software and Complex Hardware (DO-178/DO-254). You will liaise with customers and airworthiness authorities on...


  • Toronto, Ontario, Canada The Toronto-Dominion Bank (Canada) Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at The Toronto-Dominion Bank (Canada). As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and applications. Key Responsibilities: Provide technical leadership and expertise in designing and...


  • Toronto, Ontario, Canada Criteo Full time

    About the Role: This is a challenging opportunity for an experienced engineer to join Criteo's PRE team as a Site Reliability Engineer. The role involves working closely with product engineering to improve the reliability of our apps, systems, and pipelines, assessing where optimization is needed most, and telling stories with meaningful monitoring. Key...


  • Toronto, Ontario, Canada Lorven Technologies Full time

    Job Title: Reliability Systems Specialist. Location: Remote. Duration: Long term. A Bachelor's degree in Computer Science or related technical field, or equivalent practical experience. Advanced knowledge of reliability engineering practices and technologies. Hands-on experience in reliability tools (Ansible, Azure Automation, Catchpoint). Azure, Linux. Dynatrace,...


  • Toronto, Ontario, Canada Flinks Full time

    About Flinks: Flinks is a leading provider of data infrastructure solutions for the financial industry. Our mission is to empower consumers with control over their financial data and unlock its full potential. We equip fintechs and banks with cutting-edge data tools, enabling them to create innovative, client-centric products that are transforming the...


  • Toronto, Ontario, Canada Flinks Full time

    About Flinks: A Pioneering Force in Financial Data Management. Flinks is at the forefront of open banking and financial data management, empowering consumers to take control of their financial lives. Our mission is to unlock the full potential of financial data, enabling innovative solutions for fintechs and banks. As a leading provider of data infrastructure,...


  • Toronto, Ontario, Canada Criteo Full time

    About the Role: Criteo is seeking a talented Site Reliability Engineer to join our PRE team. What You'll Do: As a Site Reliability Engineer, you'll work closely with product engineering to improve the reliability of our apps, systems, and pipelines. You'll assess where optimization is needed most and tell stories with meaningful monitoring. How You'll Make an...


  • Toronto, Ontario, Canada Scotiabank Full time

    About the Role: We are seeking a highly skilled System Reliability Engineer to join our team at Scotiabank. As a key member of our Systems Reliability Office, you will be responsible for ensuring the stability and reliability of our technology portfolio. Key Responsibilities: Champion a customer-focused culture to deepen client relationships and leverage...


  • Toronto, Ontario, Canada Metrolinx Full time

    Job Title: Senior Reliability Engineer. Job Summary: Metrolinx is a leading transportation agency in the Greater Golden Horseshoe region, operating GO Transit, UP Express, and the PRESTO fare payment system. We are committed to providing reliable and efficient transportation services to our customers. As a Senior Reliability Engineer, you will play a critical...


  • Toronto, Ontario, Canada Criteo Full time

    About the Role: We are seeking a skilled Senior Site Reliability Engineer to join our team at Criteo. As a key member of our Product Reliability Engineering group, you will work closely with product engineering to improve the reliability of our apps, systems, and pipelines. Your Responsibilities: Collaborate with product engineering to identify and prioritize...


  • Toronto, Ontario, Canada Criteo Full time

    Company Overview: Criteo is a leader in the AdTech industry, pushing the boundaries of online advertising and driving innovation. As a Site Reliability Engineer on our team, you will be at the forefront of building and maintaining scalable systems that deliver exceptional results. About the Role: This role offers a unique blend of technical expertise and...


  • Toronto, Ontario, Canada The Home Depot Canada Full time

    About The Home Depot Canada: The Home Depot Canada is a leading retailer of home improvement products and services, committed to delivering exceptional customer experiences and driving business growth. We are seeking a highly skilled Cloud Reliability Engineering Manager to join our team and lead our Site Reliability Engineers in ensuring the reliability,...