System Reliability Engineer

2 months ago


Old Toronto, Canada Scotiabank Full time

Requisition ID: 207317

Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture.

The Team:

You will work cross-functionally with a variety of teams and contribute to all significant deliverables for the Systems Reliability Office stakeholders. You will also have an understanding of ‘what could go wrong’, solve complex problems, and have a flair for communicating and leading discussions with technical and business partners. You will leverage your deep experience with IT Service Delivery and IT Service Management to standardize and improve operations, analysis, and service levels across the Canadian Banking portfolio. You will contribute to the overall success of the Canadian Banking Stability team, ensuring specific individual and team goals, plans, and initiatives are executed and delivered in support of the Stability team’s strategies and objectives. You will ensure that all activities conducted are in compliance with governing regulations, internal policies, and procedures.

Is this role right for you? In this role you will:

  1. Champion a customer-focused culture to deepen client relationships and leverage broader Bank relationships, systems, and knowledge.
  2. Be accountable for creating, maintaining, and distributing SLO data, reports, and dashboards for our technology portfolio to stakeholders across the organization.
  3. Champion Stability and Reliability across a portfolio of applications and services by working closely with service owner teams to continuously improve the MTTR metrics and reduce downtime, leading troubleshooting of our most severe incidents and participating in incident root cause analysis to prevent recurrence.
  4. Contribute to prioritization of reliability features with service owners and engineering teams.
  5. Contribute to the design, development, and delivery of effective tooling, alerts, and automated responses to identify and address reliability risks and automation of SLOs.
  6. Participate in incident calls and, when required, lead communications on impact and recovery status.
  7. Ensure information on incidents and problems is complete and accurate, and that action items are being worked on by the assigned individuals.
  8. Produce weekly/monthly/quarterly status reporting or dashboards on incidents and problems for distribution to business and technology stakeholders.
  9. Participate in and drive post-incident activities as per organizational governance and requirements.
  10. Interface with technology teams and business partners on stability and reliability concerns and system disruptions, and provide incident details and root causes.

Do you have the skills that will enable you to succeed in this role? We'd love to work with you if you have:

  1. A degree in Computer Science, Engineering, Business Management/Commerce or equivalent experience.
  2. 8+ years of experience in the industry (Software development, DevOps, Service Management) with at least 3 years in a leadership capacity.
  3. Experience with creating and maintaining system performance dashboards.
  4. Experience with analyzing and troubleshooting systems.
  5. Excellent communication skills (both verbal and written). The ability to communicate confidently and clearly on conference calls, in meetings, via email, etc. at all levels of the organization is essential.
  6. Ability to quickly and clearly communicate incident status via email in business-friendly language.
  7. Experience with ITSM tools (ServiceNow a plus) and a strong understanding of SRE and service management principles.

Nice to Haves:

  1. Experience or familiarity with the Financial industry.
  2. Systematic problem-solving approach, coupled with effective communication skills and a sense of drive.
  3. Experience with Performance and Capacity Management (PCM) tools (e.g., Dynatrace, Splunk).
  4. Well-rounded broad knowledge of OS platforms (Linux/UNIX), Networking, Web Systems, and IT Ops.
  5. ITIL Foundation Certification.

What's in it for you?

  • Diversity, Equity, Inclusion & Allyship - We strive to create an inclusive culture where every employee is empowered to reach their fullest potential, respected for who they are, and embraced through bias-free practices and inclusive values across Scotiabank.
  • Accessibility and Workplace Accommodations - We value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone.
  • Upskilling through online courses, cross-functional development opportunities, and tuition assistance.
  • Competitive Rewards program including bonus, flexible vacation, personal and sick days, and benefits that start on day one.
  • Community Engagement - No matter where you choose to work from, we offer opportunities for community engagement and belonging through programs such as hackathons, contests, cooking with friends, Humans of Digital, and much more.
  • Work arrangements: Hybrid.

Location(s): Canada: Ontario: Toronto

Scotiabank is a leading bank in the Americas. Guided by our purpose: "for every future", we help our customers, their families, and their communities achieve success through a broad range of advice, products, and services.

  • Reliability Engineer

    4 weeks ago


    Old Toronto, Canada Thomson Reuters Full time

    About the Role: We are seeking a skilled Reliability Engineer - Cloud Systems to join our team at Thomson Reuters. As a Reliability Engineer - Cloud Systems, you will be responsible for analyzing and resolving chronic and major issues affecting our cloud-based services. Key responsibilities include: Designing and implementing scalable systems and...


  • Old Toronto, Canada Lorien Full time

    Hybrid - Manchester We are currently working with a leading gambling company dedicated to providing exceptional gaming experiences. They are looking for an experienced Site Reliability Engineer with a strong skill set in system reliability to join its world-class technology team. This role is ideal for someone who has 4+ years of experience within the...


  • Old Toronto, Canada System One Full time

    System Engineer Requirements / Configuration Manager. Accountabilities: Primarily responsible for requirements management on the project and for assuring the accuracy of the information within the requirements management tool. Lead scope/requirement discussions with technical teams and customer. Lead System Requirements effort and maintain the Requirements Traceability...


  • Old Toronto, Canada Chelsea Avondale Full time

    Chelsea Avondale is the world’s most cutting-edge home insurance group. We have developed sophisticated risk modeling and insurance pricing technologies for home insurance and deploy that technology through our own insurance company. Our team consists of some of the brightest minds in insurance, software development, finance, and operations. Our group...


  • Toronto, Ontario, Canada Lorven Technologies Full time

    Job Title : Reliability Systems Engineer Location : Remote Duration : Long term A Bachelor's degree in Computer Science or related technical field, or equivalent practical experience. Advanced knowledge of SRE practices and technologies including Azure, Linux, and scripting languages. Expertise in various SRE tools such as Ansible, Azure Automation,...


  • Toronto, Ontario, Canada Scotiabank Full time

    As a key member of our team at Scotiabank, you will play a critical role in ensuring the reliability and performance of our production systems. Key Responsibilities: Contribute to in-depth data analysis to gauge service trends and drive improvements to production systems. Collaborate closely with SREs, Development, and Operations teams to assist in...


  • Old Toronto, Canada Ascend Fundraising Solutions Full time

    We are seeking a skilled Cloud Reliability Engineer to collaborate with our IT team in Toronto. In this role, you will work closely with the client services team to diagnose, troubleshoot, and resolve system reliability issues. Responsibilities: Take ownership of customer-reported issues and drive them to resolution. Develop proactive measures to prevent...


  • Old Toronto, Canada Robinhood Full time

    About the Role: We're seeking a skilled Software Developer to join our Reliability Engineering team at Robinhood. As part of this team, you'll play a crucial role in designing, evolving, and maintaining large-scale distributed systems. The team is focused on building robust, scalable systems that ensure high availability and low latency. Our primary areas of...


  • Old Toronto, Canada Scotiabank Full time

    About the Role: The Senior Service Reliability Manager will lead and collaborate with a team to continuously improve the stability and reliability of PCBE systems through Site Reliability Engineering (SRE) based practices. This includes continuous people, process, and technology enhancements in support of our rapidly changing technology product portfolio. You...


  • Old Toronto, Canada Aversan Inc Full time

    Hardware Design Reliability Engineer North York, Ontario Position Summary Responsible for the hardware reliability activities regarding the hardware products within Engineering perimeter. Essential Functions / Key Areas of Responsibility Monitor the hardware reliability of the hardware systems in the field. Maintain a table with all the hardware returns...


  • Old Toronto, Canada Street Context Full time

    Are you a Site Reliability Engineer who has a passion for building reliable, resilient and performant systems that scale? Do you command with a steady hand when incidents unfold? Are you motivated by team success? If so, continue reading… We are on a mission to build and strengthen our engineering teams to match the accelerating success of Street...


  • Toronto, Ontario, Canada Vantage Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Vantage. As a key member of our engineering team, you will play a pivotal role in ensuring the seamless operation of our large-scale, distributed systems. Your expertise in software and systems engineering will be instrumental in building, maintaining, and...


  • Old Toronto, Canada The Home Depot Canada Full time

    About The Job: As a Cloud Reliability Engineer Lead at The Home Depot Canada, you will play a crucial role in ensuring the reliability, performance, and operational support of our eCommerce systems. Job Overview: This position requires a strong background in reliability reviews, performance engineering practices, production engineering, and operational support,...


  • Toronto, Ontario, Canada Interac Corp. Full time

    Senior System Reliability Engineer. We are seeking a skilled Senior System Reliability Engineer to join our team at Interac Corp. in Canada. About the Role: This is an exciting opportunity to work on high-performance payment systems, focusing on Site (Application) Reliability Engineering activities, including proactive monitoring, responding to alerts and...


  • Old Toronto, Canada Street Context Full time

    Are you a Site Reliability Engineer that has a passion for building reliable, resilient and performant systems that scale? We are on a mission to build and strengthen our engineering teams to match the accelerating success of Street Context. We provide a premium Email, Analytics and Broker Relationship platform, purpose-built for capital markets and...


  • Old Toronto, Canada Tecsys Full time

    Having recognized the advantages of remote work, including employee morale, productivity, reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...