Site Reliability Engineer
6 days ago
We are seeking a Site Reliability Engineer (SRE) with expertise in Dynatrace monitoring, log investigation, and observability practices. The ideal candidate will have a deep understanding of business processes and upstream-downstream dependencies, and the ability to design and implement dashboards with SLOs and SLAs that align with business objectives.
Key Responsibilities
Monitoring & Observability: Configure and maintain Dynatrace for application and infrastructure monitoring. Develop custom dashboards, alerts, and reports to track system health and performance. Define and implement Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
Log Analysis & Troubleshooting: Perform log investigation using tools like Splunk, ELK, or similar platforms. Identify root causes of incidents and provide actionable insights for resolution.
Business Understanding: Analyze business models, workflows, and critical application flows. Map upstream and downstream dependencies to ensure end-to-end reliability.
Incident Management: Participate in on-call rotations and respond to production incidents. Drive post-incident reviews and implement preventive measures.
Automation & Optimization: Automate monitoring and alerting processes to reduce manual intervention. Collaborate with development teams to improve system reliability and performance.
Required Skills & Qualifications
Technical Expertise: Strong experience with Dynatrace (configuration, dashboards, problem detection). Proficiency in log analysis tools (Splunk, ELK, or equivalent). Solid understanding of SRE principles, observability, and incident management.
Business & Analytical Skills: Ability to understand business processes and translate them into technical monitoring solutions. Experience in mapping application dependencies and creating impact analyses.
Soft Skills: Excellent communication and collaboration skills. Strong problem-solving and analytical mindset.
Preferred: Experience with cloud platforms (AWS, Azure, GCP). Familiarity with CI/CD pipelines and automation scripting.
Performance Metrics: Uptime and reliability improvements. Reduction in Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR). Accuracy and relevance of dashboards and alerts. Compliance with defined SLOs and SLAs.
Experience required: 10
Job Types: Full-time, Fixed term contract
Work Location: Hybrid remote in Toronto, ON
-
Site Reliability Engineer
17 hours ago
Toronto, Ontario, Canada Procom Full time $80,000 - $120,000 per year
Site Reliability Engineer (SRE) / Ingénieur Fiabilité des Sites. On behalf of our banking client, Procom is seeking a Site Reliability Engineer (SRE) for a 12-month contract role. This position is a hybrid role, 3 days a week onsite at our client's Montréal, Quebec office. Site Reliability Engineer - Job Description: The Site Reliability Engineer is...
-
Site Reliability Engineer
17 hours ago
Toronto, Ontario, Canada Maneva Full time US$80,000 - US$120,000 per year
About Maneva: Maneva builds and deploys edge AI solutions powering real-time intelligence for industrial environments. Our systems run on distributed edge compute devices (NVIDIA Jetson platforms), integrate with local network cameras, PLCs, sensors, and other on-premise equipment, and securely communicate with cloud services via client- or site-based VPNs....
-
Site Reliability Engineer
6 days ago
Toronto, Ontario, Canada Tecsys Inc. Full time $85,000 - $130,000 per year
Having recognized the advantages of remote work, including employee morale, productivity, reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...
-
Site Reliability Engineer
15 hours ago
Toronto, Ontario, Canada Apptoza Inc. Full time $30,000 - $120,000 per year
Hi, hope you are doing great. If you are fine with the JD below, please share your updated resume ASAP. Site Reliability Engineer. Location: Toronto (onsite). Duration: 6 months. Experience required: 10 years. Job Description: Job Title: SRE. Technical/Functional Skills: • 8+ years of overall IT experience. • Advanced Linux/Unix support experience required. • Strong shell...
-
Site Reliability Engineer
15 hours ago
Toronto, Ontario, Canada Xplor Full time $125,000 - $150,000
Company Description: Take a seat on the Xplor rocketship and join us as Site Reliability Engineer to help people succeed across the world. From dropping your kids off at childcare, getting something at home repaired, going to the gym or a fitness studio, to picking up your dry cleaning, our software, payments, and commerce-enabling solutions help everyday...
-
Site Reliability Engineer
1 week ago
Toronto, Ontario, Canada Pixomondo Full time $120,000 - $180,000 per year
We're seeking an experienced Site Reliability Engineer to join our team and lead infrastructure automation, CI/CD workflows, and deployment operations for a custom web platform. You'll be working with a modern DevOps stack including GitHub Actions, GCP, Kubernetes, Terraform, PostgreSQL, CodeDeploy, and Cloudflare to ensure our platform is robust, scalable,...
-
Site Reliability Engineer
6 days ago
Toronto, Ontario, Canada Kablamo Full time $90,000 - $120,000 per year
Reports to: Technical Support Manager. Location: Toronto (Hybrid). Role Type: Full time. Level: Intermediate/Mid. Introduction: Kablamo is a fast-growing cloud digital product development company. Founded in 2017 in Australia, the business has grown quickly over the last several years, including the expansion of the team to Canada in 2021. We are proud to have...
-
Site Reliability Engineer
15 hours ago
Toronto, Ontario, Canada McCain Foods Full time $102,700 - $137,000 per year
Position Title: Site Reliability Engineer. Position Type: Regular - Full-Time. Position Location: Toronto HQ. Requisition ID: 36904. Our Global Technology team's goal is to leverage technology and data to drive profitable growth, focus on enhancing customer experience and to further our purpose of 'Celebrating real connections through delicious, planet-friendly food'....
-
Lead Site Reliability Engineer
1 week ago
Toronto, Ontario, Canada AceStack Full time $120,000 - $200,000 per year
Job Title: Lead Site Reliability Engineer – Banking Domain (Wealth Management Preferred). Location: Toronto Downtown, ON (Onsite – 5 Days/Week). Duration: Contract. Experience: 14+ Years. About the Role: We are looking for a highly skilled Site Reliability Engineering (SRE) Lead with a strong background in the Banking domain, ideally within Wealth Management. The...
-
Senior Site Reliability Engineer
15 hours ago
Toronto, Ontario, Canada RBC Full time $90,000 - $120,000 per year
Job Description: What is the opportunity? Join our Commercial, Core Banking and Payments Technology (CCBPT) team as a Senior Site Reliability Engineer, where you'll play a key role in supporting our cloud and distributed environments for the Personal Commercial Credit SRE & Ops team. This exciting opportunity will challenge you to work with cutting-edge...