Site Reliability Engineer Leader

4 days ago


Mississauga, Ontario, Canada Devopshunt Full time
Job Overview

We are seeking a highly skilled Site Reliability Engineer Leader to join our team at Devopshunt. As a key member of our DevOps team, you will be responsible for ensuring the stability, reliability, and efficiency of our customer experience management platforms.

Key Responsibilities
  • Ensure that infrastructure and applications perform within established Service Level Agreements (SLAs) and Service Level Objectives (SLOs).
  • Maintain well-documented standards and best practices to ensure services are built for high availability and security.
  • Implement appropriate automation and observability to achieve low and continuously improving mean time to recovery (MTTR) for service-impacting incidents.
  • Document any incidents thoroughly, along with corresponding problem records and corrective actions.
  • Participate in the Architectural Review Process for new and existing services, ensuring compliance with standards for high availability, observability, security, and cost efficiency.
  • Enhance governance processes to ensure all platform components meet current standards.
  • Lead root cause analysis for major incidents: communicate with senior stakeholders, drive problem-solving, and debug using best-practice techniques.
  • Design and conduct fault injection experiments to identify potential weak points in high-availability architecture and work with engineering teams to remediate findings.
  • Collaborate with engineering teams to optimize infrastructure for security, resiliency, and cost targets based on collected feedback.
  • Document processes and maintain records related to infrastructure procedures and strategies, ensuring appropriate alerts and support procedures are in place for quick incident remediation.
Requirements
  • Bachelor's degree in Computer Science, Engineering, Information Technology, or equivalent experience.
  • 5+ years of experience in site reliability engineering or a related field.
  • Proven leadership and team management experience.
  • Experience with programming languages such as Go or Python, and with shell scripting.
  • Proficient with Terraform and infrastructure as code principles.
  • Demonstrated proficiency in public cloud environments, particularly AWS.
  • Hands-on experience with Kubernetes management within AWS EKS.
  • Experience with CI/CD automation tools, such as CircleCI and ArgoCD.
  • Experience with monitoring and logging using tools like Prometheus, Grafana, OpenTelemetry, CloudWatch, and Honeycomb.
  • AWS and Kubernetes certifications (Solutions Architect, SysOps Administrator, DevOps Engineer, CKA, CKS, CKAD, KCNA) are desirable.
About Devopshunt

Devopshunt is a fast-growing company that delivers customer communications solutions to some of the largest utility, insurance, and government entities across North America. We offer a casual work environment, competitive compensation, and a stellar benefits program. Our office is small enough to allow creative individuals to flourish, yet large enough to provide long-term stability.

We place a tremendous amount of responsibility on our team members to be productive, focused, and self-motivated. We are an equal opportunity employer dedicated to building an inclusive and diverse workforce. We will provide accommodations during the recruitment process upon request.



  • Mississauga, Ontario, Canada Devopshunt Full time

    Job Overview: We are seeking a highly skilled Site Reliability Engineer Lead to join our team at Devopshunt. As a key member of our DevOps team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design and implement scalable and highly available cloud infrastructure...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    Job Title: Team Lead, Site Reliability. We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. As a Team Lead, Site Reliability, you will be responsible for guiding our team in enhancing platform stability, reliability, and efficiency. Key Responsibilities: Ensure infrastructure...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our dynamic team at KUBRA Data Transfer Ltd. As a key member of our infrastructure team, you will play a critical role in ensuring the high availability, security, and performance of our cloud-based systems. Key Responsibilities: Design and implement scalable and secure cloud...


  • Mississauga, Ontario, Canada Mimecast Full time

    Senior Site Reliability Engineer. Contribute to the Development of Advanced Cloud-Scalable AI-Driven Security Solutions. Are you passionate about software security? Do you excel in implementing large-scale public cloud solutions? Are you eager to apply Machine Learning to tackle intricate challenges? This position could be the perfect fit for you. Our...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at KUBRA Data Transfer Ltd. As a key member of our technical operations team, you will play a critical role in ensuring the high availability, security, and performance of our cloud-based infrastructure. Key Responsibilities: Design and implement scalable and secure cloud...


  • Mississauga, Ontario, Canada Devopshunt Full time

    Job Title: Site Reliability Engineer Team Lead. We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. As a Site Reliability Engineer Team Lead, you will be responsible for guiding our team in enhancing platform stability, reliability, and efficiency. Key...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    Job Title: Team Lead, Site Reliability. We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. Key Responsibilities: Ensure infrastructure and applications perform within established Service Level Agreements (SLA) and Service Level Objectives (SLO). Maintain well-documented...


  • Mississauga, Ontario, Canada Devopshunt Full time

    Job Title: Site Reliability Engineer Team Lead. We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. Key Responsibilities: Identify potential issues and resolve complex problems. Lead technical and business discussions. Implement automation and observability to achieve low...


  • Mississauga, Ontario, Canada Devopshunt Full time

    Job Title: Site Reliability Engineer Team Lead. We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. Key Responsibilities: Identify potential issues and resolve complex problems to ensure high-availability, observability, security, and cost efficiency standards. Implement...


  • Mississauga, Ontario, Canada Devopshunt Full time

    Job Title: Site Reliability Engineer Team Lead. We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. Key Responsibilities: Guide the DevOps team in identifying potential issues, resolving complex problems, and leading technical and business discussions. Implement automation and...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer Lead to join our team at KUBRA Data Transfer Ltd. As a key member of our DevOps team, you will be responsible for ensuring the stability, reliability, and efficiency of our customer experience management platforms. Key Responsibilities: Infrastructure Optimization: Ensure that our...