Site Reliability Engineer

2 months ago


Old Toronto, Canada TD Bank Full time

Work Location: Canada

Hours: 37.5

Line of Business: Technology Solutions

Pay Details: We’re committed to providing fair and equitable compensation to all our colleagues. As a candidate, we encourage you to have an open dialogue with a member of our HR Team and ask compensation related questions, including pay details for this role.

Job Description:

CUSTOMER

  • Provide technical leadership to improve the design and operation of systems in alignment to reliability engineering best practices and overall Technology and Bank strategies, applying the practices of computer science and software engineering to the design and development of large, complex systems.
  • Drive and influence integrated DevOps solutions across business, product, platform, infrastructure, development, support/DevOps teams that improve the design and operation of systems, making them scalable, reliable, and efficient while ensuring performance and high availability of products/services.
  • Ensure availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of products/service(s) including enterprise systems that may serve multiple services and applications/segments.
  • Influence and partner with key technology and product team members in the design and development of solutions that promote automation and the elimination of toil; identify optimal ways to make systems more scalable, reliable, and efficient, and implement the required changes.
  • Define and prioritize problems to solve with applications/products/services and respective systems and drive the resolution/remediation with technology teams across design, implementation, and support.
  • Develop deep relationships with Product Owners, Tech Leads and Ops to build transparency and help foster end-to-end accountability of products and services.
  • Work in close partnership with technology teams to support TD's business objectives and operational support goals providing domain expertise on strategic Infrastructure as well as Business project related activities.
  • Review technical deliverables throughout the design and development phase to ensure systems adhere to SRE best practices.

SHAREHOLDER

  • Ensure adherence of Operational (Production) Readiness practices of respective products and services.
  • Set service-level objectives (SLOs) that define the availability of a particular product or service, and exercise key decision rights of the SRE role (e.g. supporting release to production, rejecting software that is operationally substandard, and directing developers to improve the code).
  • Implement the observability requirements to monitor and assure that our systems meet the expected service levels and perform with the appropriate operational characteristics.
  • Focus on reliability, scalability, and the development of the production computing infrastructure, including highly complex and scalable systems.
  • Develop observability standards to ensure that production systems operate under known conditions and transparently provide the measurements needed to anticipate when errors or failures can arise.
  • Engineer solutions through problem post-mortem reviews to ensure that problem solutions are complete and that errors will not manifest again.
  • Anticipate internal and external business challenges, helping teams find solutions through continuously improving on process and technologies.
  • Lead interaction with governance and control groups (e.g. regulatory/operational risk, compliance, and audit) to provide subject matter expertise and consult on risk issues related to Engineering technology and tools.
  • Lead or contribute to cross-functional/enterprise initiatives as an organizational or subject matter expert helping to identify risk/provide guidance for significant and complex situations.
  • Proactively identify emerging technologies and innovative solutions for building more robust platform domains; keep abreast of emerging issues, trends, and evolving regulatory requirements and assess potential impacts.
  • Protect the interests of the organization – identify and manage risks, and escalate non-standard, high-risk transactions/activities as necessary.
  • Maintain a culture of risk management and control, supported by effective processes in alignment with risk appetite.

EMPLOYEE / TEAM

  • Participate fully as a member of the team, support a positive work environment that promotes service to the business, quality, innovation, and teamwork and ensure timely communication of issues/points of interest.
  • Support the team by continuously enhancing knowledge/expertise in own area and participate in knowledge transfer within the team and business unit.
  • Keep current on emerging trends/developments and grow knowledge of the business, related tools, and techniques.
  • Participate in personal performance management and development activities, including cross-training within own team.
  • Keep others informed and up to date about the status/progress of projects and/or all relevant or useful information related to day-to-day activities.
  • Contribute to the success of the team by willingly assisting others in the completion and performance of work activities; provide training, coaching and/or guidance as appropriate.
  • Contribute to a fair, positive and equitable environment that supports a diverse workforce.
  • Act as a brand ambassador for your business area/function and the bank, both internally and/or externally.

BREADTH & DEPTH:

  • Expert Site Reliability Engineering role with comprehensive expertise in leading-edge theories, engineering practices, extensive coding and scripting.
  • Advanced and highly specialized knowledge of applications, systems, networks, innovation models, design activities, best practices, business/organization, and Bank standards; may fulfill a governance role.
  • Engineering specialist assigned to work autonomously on high profile, complex and/or high-risk technology initiatives with significant impact to the organization.
  • Provides technical leadership/consulting/direction to multiple businesses and product teams, growing capability across the organization.
  • Resolves unique and complex problems that have a broad impact on the business.
  • Authoritative expert on site reliability issues within area of specialization.
  • Understands the journey of an enterprise transformation where there is a hybrid cloud/non-cloud operating model.
  • Drives end-to-end accountability of products and services across the enterprise through collaboration and transparency.
  • Primarily works at the product umbrella, segment, LOB or Product Family level.
  • Typically reports to the Site Reliability Practice Area Lead.

EXPERIENCE AND / OR EDUCATION

  • University degree in Computer Science or related technical field involving systems engineering or equivalent practical experience.
  • 10+ years of engineering experience (e.g. Software or platform).

Who We Are:

TD is one of the world's leading global financial institutions and is the fifth largest bank in North America by branches/stores. Every day, we deliver legendary customer experiences to over 27 million households and businesses in Canada, the United States and around the world. More than 95,000 TD colleagues bring their skills, talent, and creativity to the Bank, those we serve, and the economies we support. We are guided by our vision to Be the Better Bank and our purpose to enrich the lives of our customers, communities and colleagues.

Our Total Rewards Package:
Our Total Rewards package reflects the investments we make in our colleagues to help them and their families achieve their financial, physical, and mental well-being goals. Total Rewards at TD includes a base salary, variable compensation, and several other key plans such as health and well-being benefits, savings and retirement programs, paid time off, banking benefits and discounts, career development, and reward and recognition programs.

Additional Information:
We’re delighted that you’re considering building a career with TD. Through regular development conversations, training programs, and a competitive benefits plan, we’re committed to providing the support our colleagues need to thrive both at work and at home.

Colleague Development:
If you’re interested in a specific career path or are looking to build certain skills, we want to help you succeed. You’ll have regular career, development, and performance conversations with your manager, as well as access to an online learning platform and a variety of mentoring programs to help you unlock future opportunities.

Training & Onboarding:
We will provide training and onboarding sessions to ensure that you’ve got everything you need to succeed in your new role.

Interview Process:
We’ll reach out to candidates of interest to schedule an interview. We do our best to communicate outcomes to all applicants by email or phone call.

Accommodation:
Your accessibility is important to us. Please let us know if you’d like accommodations (including accessible meeting rooms, captioning for virtual interviews, etc.) to help us remove barriers so that you can participate throughout the interview process.

Language Requirement: N/A.

Our Values:
At TD we’re guided by our purpose to enrich the lives of our customers, communities and colleagues, and share a set of values that shape our culture and guide our behavior.

Our Commitment to Diversity, Equity, and Inclusion:
At TD, we’re committed to fostering an environment where all colleagues are encouraged to bring their authentic selves to work, experience equitable opportunities, and feel respected and supported.

Helping to Make an Impact in Communities – TD Ready Commitment:
TD has a long-standing commitment to help drive progress towards a more inclusive and sustainable future.

  • Old Toronto, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Old Toronto, Canada Reperio Human Capital Full time

    Site Reliability Engineer 100421 Location: Ireland/UK Salary: €70K+ Type: Permanent, Full-time We're seeking experienced Site Reliability Engineers who excel at ensuring the reliability and scalability of production systems, and possess extensive experience with monitoring and automation tools. Responsibilities: Ensure the reliability,...


  • Old Toronto, Ontario, CA CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Old Toronto, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and Confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Old Toronto, Ontario, CA Reperio Human Capital Full time

    Site Reliability Engineer 100421 Location: Ireland/UK Salary: €70K+ Type: Permanent, Full-time We're seeking experienced Site Reliability Engineers who excel at ensuring the reliability and scalability of production systems, and possess extensive experience with monitoring and automation tools. Responsibilities: Ensure the reliability,...


  • Old Toronto, Ontario, CA CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and Confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Old Toronto, Canada Thomson Reuters Full time

    (Canada) Site Reliability Engineer (Contract) Contract (9 months 4 days) Published 3 days ago New Relic Data Dog Site Reliability Engineer - in the Service Management Organization Do you have experience in IT Service Management, working with cloud providers, software development, and technology infrastructure? The Site Reliability Engineer will...


  • Old Toronto, Canada Thomson Reuters Full time

    (Canada) Site Reliability Engineer (Contract) Contract (5 months 29 days) Published 8 months ago CLOSED GCP Site Reliability Engineer - in the Service Management Organization Do you have experience in IT Service Management, working with cloud providers, software development, and technology infrastructure? The Site Reliability Engineer will analyze...


  • Old Toronto, Canada eTeam Full time

    Remote Work Duration 4 months - Preference is to find candidates who are willing to be converted to full-time employees. The conversion decision will be made based on performance. Job Description Role Description: Defining and measuring reliability goals—SLIs, SLOs, and error budgets for user journey. Designing for and implementing observability (ELK,...


  • Toronto, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Old Toronto, Canada Rogers Full time

    Site Reliability Engineer Are you ready to take your career to new heights and be a part of a dynamic team at Rogers Sports & Media? We believe in creativity, innovation, and collaboration in everything we do, and we are looking for people who share this mindset to join us. With a monthly reach of 30 million Canadians, you can help shape the future of...


  • Old Toronto, Canada Rogers Communications, Inc. Full time

    Site Reliability Engineer Are you ready to take your career to new heights and be a part of a dynamic team at Rogers Sports & Media? We believe in creativity, innovation, and collaboration in everything we do, and we are looking for people who share this mindset to join us. With a monthly reach of 30 million Canadians, you can help shape the future of sports,...

