Lead Site Reliability Engineer
7 days ago
Job Title: Lead Site Reliability Engineer – Banking Domain (Wealth Management Preferred)
Location: Toronto Downtown, ON (Onsite – 5 Days/Week)
Duration: Contract
Experience: 14+ Years
About the Role:
We are looking for a highly skilled Site Reliability Engineering (SRE) Lead with a strong background in the Banking domain, ideally within Wealth Management. The ideal candidate will lead the SRE function to ensure system reliability, scalability, and performance across mission-critical financial applications. This role involves hands-on technical expertise combined with leadership responsibilities to drive service excellence and operational efficiency.
Key Responsibilities:
· Lead and mentor a team of SREs responsible for production stability, reliability, and availability of banking and wealth management systems.
· Design and implement monitoring, alerting, and incident response strategies to proactively manage system health.
· Collaborate with development and infrastructure teams to drive DevOps and automation initiatives, ensuring smooth CI/CD pipelines.
· Define and implement SLIs, SLOs, and SLAs to measure and improve service performance.
· Manage and drive incident management, root cause analysis (RCA), and problem resolution to ensure minimal downtime and business impact.
· Lead capacity planning, performance tuning, and disaster recovery strategies.
· Drive observability and resilience engineering best practices across all platforms.
· Work closely with stakeholders in banking and wealth management domains to align reliability goals with business needs.
· Establish governance processes and ensure compliance with financial regulatory and security standards.
· Develop dashboards and reporting metrics to provide visibility into system performance and reliability.
· Champion a culture of continuous improvement, automation, and a reliability-first mindset.
Required Skills & Experience:
· 10+ years of total IT experience, including at least 4 years in Site Reliability Engineering or Production Operations leadership roles.
· Strong domain experience in Banking, with exposure to Wealth Management systems (highly desirable).
· Expertise in Linux/Unix administration, networking, and cloud infrastructure (AWS, Azure, or GCP).
· Strong scripting and automation experience (Python, Shell, or similar).
· Proficiency in monitoring and observability tools such as Prometheus, Grafana, Splunk, ELK, AppDynamics, or Dynatrace.
· Experience with CI/CD pipelines, Git, Jenkins, Ansible, Terraform, or equivalent tools.
· In-depth understanding of incident, problem, and change management based on ITIL principles.
· Proven track record in managing production systems supporting large-scale, high-availability financial applications.
· Excellent communication, stakeholder management, and team leadership skills.
-
Site Reliability Engineer
3 hours ago
Toronto, Ontario, Canada Procom Full time $80,000 - $120,000 per year
Site Reliability Engineer (SRE) / Ingénieur Fiabilité des Sites. On behalf of our banking client, Procom is seeking a Site Reliability Engineer (SRE) for a 12-month contract role. This position is a hybrid role, 3 days a week onsite at our client's Montréal, Quebec office. Site Reliability Engineer - Job Description: The Site Reliability Engineer is...
-
Site Reliability Engineer
5 days ago
Toronto, Ontario, Canada Tecsys Inc. Full time $85,000 - $130,000 per year
Having recognized the advantages of remote work, including improved employee morale and productivity and the benefits of reduced commuting for employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...
-
Site Reliability Engineer
2 weeks ago
Toronto, Ontario, Canada Tekgence Inc Full time $80,000 - $120,000 per year
Hello, please find the job description below.
Site Reliability Engineering (SRE), Toronto ON
Skills Required: Python, Google Cloud, Site Reliability Engineering (SRE)
Job Description: Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding. Partner with development teams to...
-
Site Reliability Engineer
1 week ago
Toronto, Ontario, Canada Pixomondo Full time $120,000 - $180,000 per year
We're seeking an experienced Site Reliability Engineer to join our team and lead infrastructure automation, CI/CD workflows, and deployment operations for a custom web platform. You'll be working with a modern DevOps stack including GitHub Actions, GCP, Kubernetes, Terraform, PostgreSQL, CodeDeploy, and Cloudflare to ensure our platform is robust, scalable,...
-
Site Reliability Engineer
4 hours ago
Toronto, Ontario, Canada Maneva Full time US$80,000 - US$120,000 per year
About Maneva: Maneva builds and deploys edge AI solutions powering real-time intelligence for industrial environments. Our systems run on distributed edge compute devices (NVIDIA Jetson platforms), integrate with local network cameras, PLCs, sensors, and other on-premise equipment, and securely communicate with cloud services via client- or site-based VPNs....
-
Site Reliability Engineer
5 days ago
Toronto, Ontario, Canada Kablamo Full time $90,000 - $120,000 per year
Reports to: Technical Support Manager
Location: Toronto (Hybrid)
Role Type: Full time
Level: Intermediate/Mid
Introduction: Kablamo is a fast-growing cloud digital product development company. Founded in 2017 in Australia, the business has grown quickly over the last several years, including the expansion of the team to Canada in 2021. We are proud to have...
-
Site Reliability Engineer
1 hour ago
Toronto, Ontario, Canada Xplor Full time $125,000 - $150,000
Company Description: Take a seat on the Xplor rocketship and join us as Site Reliability Engineer to help people succeed across the world. From dropping your kids off at childcare, getting something at home repaired, going to the gym or a fitness studio, to picking up your dry cleaning - our software, payments, and commerce-enabling solutions help everyday...
-
Site Reliability Engineer
1 hour ago
Toronto, Ontario, Canada McCain Foods Full time $102,700 - $137,000 per year
Position Title: Site Reliability Engineer
Position Type: Regular - Full-Time
Position Location: Toronto HQ
Requisition ID: 36904
Our Global Technology team's goal is to leverage technology and data to drive profitable growth, focus on enhancing customer experience and to further our purpose of 'Celebrating real connections through delicious, planet-friendly food'....
-
Site Reliability Engineer
2 hours ago
Toronto, Ontario, Canada Apptoza Inc. Full time $30,000 - $120,000 per year
Hi, hope you are doing great. If you are fine with the JD below, please share your updated resume ASAP.
Site Reliability Engineer
Location: Toronto (Onsite)
Duration: 6 months
Experience Required: 10 years
Job Description:
Job Title: SRE
Technical/Functional Skills
• 8+ years of overall IT experience.
• Advanced Linux / Unix support experience required.
• Strong shell...
-
Senior Site Reliability Engineer
7 days ago
Toronto, Ontario, Canada Zensurance Full time $120,000 - $180,000 per year
About Us: Zensurance is redefining commercial insurance for Canadian businesses. As a leading InsurTech, we make getting the right coverage simple, fast, and accessible through a digital-first experience. Our platform combines advanced technology with deep industry expertise to deliver tailored insurance solutions that help businesses thrive. Zensurance has...