Director of Site Reliability Engineering

6 days ago


Mississauga, Canada Element Fleet Management Full time
We are re-defining the fleet management industry to be people first, then business – delivering on our promise of a superior client experience.

What We Need

We are looking for a Director, Site Reliability Engineering to join Element Fleet Management. As the largest pure-play fleet manager in the world, we provide unmatched products, services, and solutions to our clients.

Are you someone with experience using data analytics to drive decision-making for system improvements and incident prevention?

As the Director, Site Reliability Engineering, you will lead and manage our SRE team, working closely with cross-functional teams to implement and refine SRE practices, minimize downtime, and drive automation for high efficiency. You will bring a mix of operational and engineering expertise to design robust systems, oversee incident management, monitor key metrics, and foster a culture of continuous improvement. You will also provide ongoing training and development opportunities for team growth.

  • Incident Management and Response: Lead the team in incident response, coordinating with cross-functional stakeholders to ensure timely resolution.
  • Problem Management: Analyze and address underlying issues in applications and systems to prevent recurring incidents.
  • Change Management and Release Engineering: Implement and oversee change management practices, ensuring safe and reliable releases. Work closely with development and QA teams to standardize and optimize deployment pipelines for maximum reliability and scalability.
  • Monitoring, Alerting, and Reporting: Build and maintain robust monitoring, logging, and alerting solutions for system health and application performance.
  • Automation and Tooling: Drive the adoption of automation and self-healing systems to reduce manual intervention, improve efficiency, and minimize human error. Oversee the development of tools and frameworks to support automation in deployment, monitoring, and incident response.
  • Capacity Planning and Disaster Recovery: Conduct capacity planning and manage resources to ensure systems can handle current and future demands.
  • Audit and Compliance: Collaborate with internal and external audit teams to ensure that our production systems meet SOC1, SOX, and other regulatory requirements.
  • Vendor Management: Manage relationships with external vendors to ensure they meet performance and service level agreements.

Requirements

  • Bachelor's degree in computer science, engineering, or a related field.
  • 10+ years of experience in IT operations, SRE, or a related field, with a strong record of managing high-availability systems in production environments.
  • Solid understanding of SRE principles and practices, including error budgets, service level objectives (SLOs), and service level indicators (SLIs).
  • Strong background in automation, CI/CD, and DevOps practices, with experience using tools such as Jenkins, GitLab CI/CD, or similar.
  • Experience with observability tools such as Prometheus, Grafana, ELK Stack, Splunk, or DataDog, and the ability to design, implement, and interpret monitoring and alerting systems.
  • Proven ability to lead and manage incident response and post-incident analysis, with a focus on improving response times and reducing incident frequency.
  • Proficiency in scripting and programming languages such as Python, Go, or Bash, with an ability to build automation scripts and tooling.
  • Familiarity with SOC1, SOX, and other regulatory compliance frameworks, and experience in maintaining audit and compliance documentation.
  • Strong project management skills with a focus on prioritization, resource planning, and risk assessment.

Nice-to-Have Skills

  • Google Cloud Professional DevOps Engineer, AWS Certified DevOps Engineer, or Certified Kubernetes Administrator (CKA)
  • ITIL Certification, ITSM Certification, or PMP certification
  • Familiarity with advanced SRE tools and practices such as chaos engineering, load testing, and synthetic monitoring
  • Experience managing third-party relationships to ensure vendors meet performance and service level expectations
  • Hands-on experience in coordinating with audit teams for compliance documentation and requirements.


  • Mississauga, Canada CEI Fleet Collision and Safety Full time

    Director, Site Reliability Engineering. Locations: Mississauga. Time type: Full time. Posted 2 Days Ago. Job requisition ID: R104373. We are looking for a Director, Site Reliability Engineering to join Element Fleet Management. As the largest pure-play fleet manager in the world, we provide unmatched products, services, and solutions to our...


  • Mississauga, Canada CEI Fleet Collision and Safety Full time

    Director, Site Reliability Engineering. Locations: Mississauga. Time type: Full time. Posted 3 Days Ago. Job requisition ID: R104373. Get started on an exciting career at Element! What We Need: We are looking for a Director, Site Reliability Engineering to join Element Fleet Management. As the largest pure-play fleet manager in the world, we provide...


  • Mississauga, Ontario, Canada Interesting Engineering, Inc. Full time

    Job Title: Technical Engineering Director. We are seeking a highly experienced and skilled Senior Site Reliability Engineer to lead our team in optimizing our customer experience management platforms. About the Role: Develop and implement strategic plans to achieve low and continuously improving mean time to recovery (MTTR) for service-impacting...


  • Mississauga, Ontario, Canada CEI Fleet Collision and Safety Full time

    We are seeking an experienced Site Reliability Engineering Team Lead to join our team at CEI Fleet Collision and Safety. Job Description: As a Director of Site Reliability Engineering, you will lead and manage our SRE team, working closely with cross-functional teams to implement and refine SRE practices, minimize downtime, and drive automation for high...


  • Mississauga, Ontario, Canada KUBRA Full time

    About the Role: We are seeking a seasoned Site Reliability Engineering Lead to join our team at KUBRA, a fast-growing company delivering customer communications solutions to leading utility, insurance, and government entities across North America. This is an exciting opportunity for a skilled engineer to drive our DevOps team in optimizing customer experience...


  • Mississauga, Ontario, Canada KUBRA Full time

    Enhance Platform Stability and Efficiency: Award-winning company KUBRA seeks a skilled Team Lead, Site Reliability Engineer to join our dynamic team in Mississauga, ON. About the Role: We are growing rapidly and looking for an experienced professional to guide our DevOps team in optimizing our customer experience management platforms. As a Site Reliability...


  • Mississauga, Ontario, Canada KUBRA Full time

    Job Description: In this dynamic role, you will work collaboratively with cross-functional teams to apply SRE principles and drive continuous improvement. As a seasoned Site Reliability Engineer, your technical expertise will be pivotal in identifying potential issues, resolving complex problems, and leading technical and business...


  • Mississauga, Ontario, Canada KUBRA Full time

    Job Overview: We are seeking a skilled Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms.


  • Mississauga, Ontario, Canada KUBRA Full time

    About the Role: We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms. Job Description, Key Responsibilities: Ensure infrastructure and applications perform within established Service Level Agreements (SLA) and Service Level Objectives (SLO). Maintain well-documented standards and best...


  • Mississauga, Ontario, Canada KUBRA Full time

    About KUBRA: KUBRA is a fast-growing company that delivers customer communications solutions to some of the largest utility, insurance, and government entities across North America. We offer billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions for customers. With more than 1.5 billion customer interactions...


  • Mississauga, Ontario, Canada KUBRA Full time

    About KUBRA: KUBRA is a leading provider of billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions for customers. Job Title: Site Reliability Engineering Team Lead. We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management...


  • Mississauga, Ontario, Canada KUBRA Full time

    We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms. The ideal candidate will have a passion for enhancing platform stability, reliability, and efficiency. About the Role: As a Team Lead, Site Reliability Engineer, you will work collaboratively with cross-functional teams to...


  • Mississauga, Ontario, Canada KUBRA Full time

    We are growing at KUBRA, a company that offers billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions for customers. Our team is looking for a skilled Site Reliability Engineer Team Lead to guide our DevOps team in optimizing customer experience management platforms. About the Role: As a Site Reliability...


  • Mississauga, Ontario, Canada KUBRA Full time

    We are growing at KUBRA, a leading provider of billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions. Our office is the perfect blend of creativity and stability, offering a casual work environment, competitive compensation, and a stellar benefits program. About the Role: We are seeking an experienced Site...


  • Mississauga, Ontario, Canada KUBRA Full time

    We are seeking an experienced Senior DevOps Engineer to lead our team in optimizing customer experience management platforms. As a Site Reliability Engineering Team Lead, you will guide cross-functional teams in applying SRE principles and driving continuous improvement. This dynamic role involves working collaboratively with teams to identify potential...


  • Mississauga, Ontario, Canada KUBRA Full time

    About Us: KUBRA is a leading provider of customer communications solutions to top utility, insurance, and government entities across North America. Job Overview: We are seeking an experienced Site Reliability Engineer to join our team as a Team Lead. This role will oversee the optimization of our customer experience management platforms, ensuring high...


  • Mississauga, Ontario, Canada KUBRA Full time

    We are growing at KUBRA, and we're seeking a skilled Team Lead, Site Reliability Engineer, to guide our DevOps team in optimizing our customer experience management platforms. Your technical expertise will be pivotal in identifying potential issues, resolving complex problems, and leading technical and business discussions. About the Role: ID: Site Reliability...


  • Mississauga, Ontario, Canada KUBRA Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. This is an exciting opportunity to work with a talented team and make a significant impact on our company's growth.


  • Mississauga, Ontario, Canada KUBRA Full time

    Job Title: Site Reliability Engineering Manager. About the Role: We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms. This is a hybrid opportunity based in Mississauga, ON. Your Responsibilities: Ensure infrastructure and applications perform within established Service Level...


  • Mississauga, Ontario, Canada KUBRA Full time

    We are seeking a seasoned Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. As a key member of our team, you will leverage your technical expertise to identify potential issues, resolve complex problems, and drive technical discussions. Key Responsibilities: Guide the DevOps team in implementing...