Director of Site Reliability Engineering
6 days ago
What We Need
We are looking for a Director, Site Reliability Engineering to join Element Fleet Management. As the largest pure-play fleet manager in the world, we provide unmatched products, services, and solutions to our clients.
We need someone with experience using data analytics to drive decision-making for system improvements and incident prevention.
As the Director, Site Reliability Engineering, you will lead and manage our SRE team, working closely with cross-functional teams to implement and refine SRE practices, minimize downtime, and drive automation for high efficiency. You will bring a mix of operational and engineering expertise to design robust systems, oversee incident management, monitor key metrics, and foster a culture of continuous improvement. You will also provide ongoing training and development opportunities for team growth.
- Incident Management and Response: Lead the team in incident response, coordinating with cross-functional stakeholders to ensure timely resolution.
- Problem Management: Analyze and address underlying issues in applications and systems to prevent recurring incidents.
- Change Management and Release Engineering: Implement and oversee change management practices, ensuring safe and reliable releases. Work closely with development and QA teams to standardize and optimize deployment pipelines for maximum reliability and scalability.
- Monitoring, Alerting, and Reporting: Build and maintain robust monitoring, logging, and alerting solutions for system health and application performance.
- Automation and Tooling: Drive the adoption of automation and self-healing systems to reduce manual intervention, improve efficiency, and minimize human error. Oversee the development of tools and frameworks to support automation in deployment, monitoring, and incident response.
- Capacity Planning and Disaster Recovery: Conduct capacity planning and manage resources to ensure systems can handle current and future demands.
- Audit and Compliance: Collaborate with internal and external audit teams to ensure that our production systems meet SOC1, SOX, and other regulatory requirements.
- Vendor Management: Manage relationships with external vendors to ensure they meet performance and service level agreements.
Requirements
- Bachelor's degree in computer science, engineering, or a related field.
- 10+ years of experience in IT operations, SRE, or a related field, with a strong record of managing high-availability systems in production environments.
- Solid understanding of SRE principles and practices, including error budgets, service level objectives (SLOs), and service level indicators (SLIs).
- Strong background in automation, CI/CD, and DevOps practices, with experience using tools such as Jenkins, GitLab CI/CD, or similar.
- Experience with observability tools such as Prometheus, Grafana, ELK Stack, Splunk, or DataDog, and the ability to design, implement, and interpret monitoring and alerting systems.
- Proven ability to lead and manage incident response and post-incident analysis, with a focus on improving response times and reducing incident frequency.
- Proficiency in scripting and programming languages such as Python, Go, or Bash, with an ability to build automation scripts and tooling.
- Familiarity with SOC1, SOX, and other regulatory compliance frameworks, and experience in maintaining audit and compliance documentation.
- Strong project management skills with a focus on prioritization, resource planning, and risk assessment.
Nice-to-Have Skills
- Google Cloud Professional DevOps Engineer, AWS Certified DevOps Engineer, or Certified Kubernetes Administrator (CKA)
- ITIL Certification, ITSM Certification, or PMP certification
- Familiarity with advanced SRE tools and practices such as chaos engineering, load testing, and synthetic monitoring
- Experience managing third-party relationships to ensure vendors meet performance and service level expectations
- Hands-on experience coordinating with audit teams on compliance documentation and requirements
-
Director of Site Reliability Engineering
4 weeks ago
Mississauga, Canada CEI Fleet Collision and Safety Full time
Director, Site Reliability Engineering. Location: Mississauga. Time type: Full time. Posted 2 days ago. Job requisition ID: R104373. We are looking for a Director, Site Reliability Engineering to join Element Fleet Management. As the largest pure-play fleet manager in the world, we provide unmatched products and services and solutions to our...
-
Director of Site Reliability Engineering
3 weeks ago
Mississauga, Canada CEI Fleet Collision and Safety Full time
Director, Site Reliability Engineering. Location: Mississauga. Time type: Full time. Posted 3 days ago. Job requisition ID: R104373. Get started on an exciting career at Element! What We Need: We are looking for a Director, Site Reliability Engineering to join Element Fleet Management. As the largest pure-play fleet manager in the world, we provide...
-
Technical Engineering Director
1 month ago
Mississauga, Ontario, Canada Interesting Engineering, Inc. Full time
Job Title: Technical Engineering Director. We are seeking a highly experienced and skilled Senior Site Reliability Engineer to lead our team in optimizing our customer experience management platforms. About the Role: Develop and implement strategic plans to achieve low and continuously improving mean time to recovery (MTTR) for service-impacting...
-
Site Reliability Engineering Team Lead
3 weeks ago
Mississauga, Ontario, Canada CEI Fleet Collision and Safety Full time
We are seeking an experienced Site Reliability Engineering Team Lead to join our team at CEI Fleet Collision and Safety. Job Description: As a Director of Site Reliability Engineering, you will lead and manage our SRE team, working closely with cross-functional teams to implement and refine SRE practices, minimize downtime, and drive automation for high...
-
Site Reliability Engineering Lead
2 days ago
Mississauga, Ontario, Canada KUBRA Full time
About the Role: We are seeking a seasoned Site Reliability Engineering Lead to join our team at KUBRA, a fast-growing company delivering customer communications solutions to leading utility, insurance, and government entities across North America. This is an exciting opportunity for a skilled engineer to drive our DevOps team in optimizing customer experience...
-
Site Reliability Engineering Manager
3 days ago
Mississauga, Ontario, Canada KUBRA Full time
Enhance Platform Stability and Efficiency: Award-winning company KUBRA seeks a skilled Team Lead, Site Reliability Engineer to join our dynamic team in Mississauga, ON. About the Role: We are growing rapidly and looking for an experienced professional to guide our DevOps team in optimizing our customer experience management platforms. As a Site Reliability...
-
Site Reliability Engineering Manager
7 days ago
Mississauga, Ontario, Canada KUBRA Full time
Job Description: In this dynamic role, you will work collaboratively with cross-functional teams to apply SRE principles and drive continuous improvement. As a seasoned Site Reliability Engineer, your technical expertise will be pivotal in identifying potential issues, resolving complex problems, and leading technical and business...
-
Site Reliability Engineer Leader
3 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
Job Overview: We are seeking a skilled Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms.
-
Site Reliability Engineering Manager
1 week ago
Mississauga, Ontario, Canada KUBRA Full time
About the Role: We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms. Job Description: Key Responsibilities: Ensure infrastructure and applications perform within established Service Level Agreements (SLAs) and Service Level Objectives (SLOs). Maintain well-documented standards and best...
-
Site Reliability Engineering Manager
2 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
About KUBRA: KUBRA is a fast-growing company that delivers customer communications solutions to some of the largest utility, insurance, and government entities across North America. We offer billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions for customers. With more than 1.5 billion customer interactions...
-
Site Reliability Engineering Team Lead
3 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
About KUBRA: KUBRA is a leading provider of billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions for customers. Job Title: Site Reliability Engineering Team Lead. We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management...
-
Site Reliability Engineer Leadership Position
3 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms. The ideal candidate will have a passion for enhancing platform stability, reliability, and efficiency. About the Role: As a Team Lead, Site Reliability Engineer, you will work collaboratively with cross-functional teams to...
-
Site Reliability Engineer Team Lead
4 days ago
Mississauga, Ontario, Canada KUBRA Full time
We are growing at KUBRA, a company that offers billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions for customers. Our team is looking for a skilled Site Reliability Engineer Team Lead to guide our DevOps team in optimizing customer experience management platforms. About the Role: As a Site Reliability...
-
Site Reliability Engineering Team Lead
2 days ago
Mississauga, Ontario, Canada KUBRA Full time
We are growing at KUBRA, a leading provider of billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions. Our office is the perfect blend of creativity and stability, offering a casual work environment, competitive compensation, and a stellar benefits program. About the Role: We are seeking an experienced Site...
-
Site Reliability Engineering Team Lead
2 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
We are seeking an experienced Senior DevOps Engineer to lead our team in optimizing customer experience management platforms. As a Site Reliability Engineering Team Lead, you will guide cross-functional teams in applying SRE principles and driving continuous improvement. This dynamic role involves working collaboratively with teams to identify potential...
-
Site Reliability Engineer
2 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
About Us: KUBRA is a leading provider of customer communications solutions to top utility, insurance, and government entities across North America. Job Overview: We are seeking an experienced Site Reliability Engineer to join our team as a Team Lead. This role will oversee the optimization of our customer experience management platforms, ensuring high...
-
Site Reliability Engineering Team Lead
3 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
We are growing at KUBRA, and we're seeking a skilled Team Lead, Site Reliability Engineer, to guide our DevOps team in optimizing our customer experience management platforms. Your technical expertise will be pivotal in identifying potential issues, resolving complex problems, and leading technical and business discussions. About the Role: ID: Site Reliability...
-
Site Reliability Engineering Team Lead
2 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
About the Role: We are seeking a highly skilled Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. This is an exciting opportunity to work with a talented team and make a significant impact on our company's growth.
-
Site Reliability Engineering Manager
2 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
Job Title: Site Reliability Engineering Manager. About the Role: We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms. This is a hybrid opportunity based in Mississauga, ON. Your Responsibilities: Ensure infrastructure and applications perform within established Service Level...
-
Site Reliability Engineering Team Lead
2 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
We are seeking a seasoned Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. As a key member of our team, you will leverage your technical expertise to identify potential issues, resolve complex problems, and drive technical discussions. Key Responsibilities: Guide the DevOps team in implementing...