Associate Director, Automation/SRE, Production Engineering

1 day ago


Toronto, Ontario, Canada RBC Full time $120,000 - $200,000 per year

Job Description
What is the opportunity?
This role reports to the Associate Director, Production Engineering Lead within Production and Risk Services (P&RS). As part of the team, you will help operationalize her strategy to embed SRE (Site Reliability Engineering) principles and best practices to drive down toil and maintenance costs. This Associate Director, Automation/SRE, Production Engineering role will be vital in shaping the SRE function within P&RS (and the wider Quantitative and Technology Services, which P&RS is part of) to improve efficiency, recovery and resiliency.

What will you do?

  • Report directly to the Production Engineering Lead, partnering closely to prioritize and address reliability and supportability gaps.
  • Opportunity to work with RBC's Predictive Engineering tool.
  • Focus on solving our operational challenges (improving supportability and enhancing how we operate a 24/7 environment); you'll support applications written in Java, .NET, C++, and web stacks across cloud, on-prem Linux/Windows, and containerized environments.
  • Engineer and maintain API integrations (e.g. REST/SOAP, OAuth, OpenAPI/Swagger) to enable robust inter-service communication.
  • Enhance and maintain database reliability and performance across relational (PostgreSQL, Oracle) and NoSQL (MongoDB, Cassandra, Redis) platforms: query optimization, replication resilience, automation.
  • Develop and manage automation tools and frameworks to reduce operational toil and build self-healing workflows; implement monitoring, alerting, SLIs/SLOs, and dashboards using tools like ITRS Geneos, Prometheus, Grafana, ELK, Dynatrace.
  • Apply long-term corrective actions and automations for recurring incidents; embed resilience patterns (retries, circuit breakers, graceful degradation) into services and systems deployed on cloud (Azure, AWS), Linux/Windows on-prem, and containers (Kubernetes/Docker).
  • Contribute to documentation and adoption of reliability best practices across development and support teams; identify reliability gaps, help build standardized frameworks and close those gaps through automation, monitoring, API/database integrations, and self-healing systems.
  • Apply effective, efficient supportability engineering to automate manual tasks (alleviating toil), and invent tools that improve issue investigation and recovery, such as intelligent operations/AIOps.
  • Lead in Disciplined Operations and governance: define best practices and production standards for our toolkit, and build and set templates for monitoring, batch controls, runbooks, etc.
  • Integrate Chaos Engineering best practices and outline front-to-back integration testing standards, accepting failure as the norm.

What do you need to succeed?
Must have:

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience
  • 5+ years of software development experience, including 5+ years of Core Java in production applications
  • 2+ years of experience with SRE (Site Reliability Engineering) principles and practices
  • Proven experience with API integrations and inter-service communication frameworks (e.g., Spring REST, Swagger/OpenAPI)
  • Real-world experience with database systems (SQL): designing for performance, automating operations
  • Strong coding and scripting abilities (Java, Python, Bash) to build automation tooling
  • Familiarity with at least one observability tool, such as ITRS Geneos, Prometheus, Grafana, ELK stack, or Dynatrace
  • Solid understanding of Linux and Windows operating systems, networking, container orchestration (Kubernetes), and cloud architecture
  • Must be extremely hands-on, detail-oriented, assertive and proactive with both day-to-day tasks and short- and long-term deliveries
  • Proven ability to collaborate well with others, be strategically focused and realize continuous improvements
  • Good organizational skills, with the ability to context switch effectively and thrive in a fast-paced environment
  • Strong interpersonal skills and self-starter attitude
  • Excellent verbal and written communication skills

Nice to have:

  • Cloud certification (Azure, AWS) or Kubernetes certification (CKA)
  • Exposure to chaos engineering, GitOps workflows (ArgoCD, Flux, Helios), and platform-level error budgeting
  • Prior in-depth experience in supporting large-scale, enterprise-wide, global trading and risk platforms

What is in it for you?
We thrive on the challenge to be our best - progressive thinking to keep growing and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.

  • A comprehensive Total Rewards Program including bonuses, flexible benefits and competitive compensation
  • Leaders who support your development through coaching and managing opportunities
  • Opportunities to work with the best in the field
  • Ability to make a difference and lasting impact
  • Work in a dynamic, collaborative, progressive, and high-performing team
  • A world-class training program in financial services
  • Flexible working options fully supported.

Job Skills

  • .NET Micro Framework, Agile Methodology, Artificial Intelligence (AI), ASP.NET C#, Atlassian Confluence, BMC Control-M, Cloud Computing, Docker (Software), Dynatrace APM, Elastic Stack (ELK), GitHub Repositories, Grafana, Group Problem Solving, Helm (Tool), ITRS Geneos, IT Systems Integration, Java, Java Software Development, Java Spring, Kubernetes, Linux, Linux Bash Scripting, Microsoft Azure, MongoDB (+ 20 more)

Additional Job Details
Address:
RBC CENTRE, 155 WELLINGTON ST W, TORONTO

City:
Toronto

Country:
Canada

Work hours/week:
37.5

Employment Type:
Full time

Platform:
CAPITAL MARKETS

Job Type:
Regular

Pay Type:
Salaried

Posted Date:

Application Deadline:

Note: Applications will be accepted until 11:59 PM on the day prior to the application deadline date above.

Inclusion and Equal Opportunity Employment
At RBC, we believe an inclusive workplace that has diverse perspectives is core to our continued growth as one of the largest and most successful banks in the world. Maintaining a workplace where our employees feel supported to perform at their best, effectively collaborate, drive innovation, and grow professionally helps to bring our Purpose to life and create value for our clients and communities. RBC strives to deliver this through policies and programs intended to foster a workplace based on respect, belonging and opportunity for all.

Join our Talent Community
Stay in-the-know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips and Recruitment events that matter to you.

Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities.


