Site Reliability Engineer Lead

2 months ago


Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time
About the Role

We are seeking a highly skilled Site Reliability Engineer Lead to join our team at KUBRA Data Transfer Ltd. As a key member of our DevOps team, you will be responsible for ensuring the stability, reliability, and efficiency of our customer experience management platforms.

Key Responsibilities
  • Infrastructure Optimization: Ensure that our infrastructure and applications perform within established Service Level Agreements (SLAs) and Service Level Objectives (SLOs).
  • Standardization and Best Practices: Maintain well-documented standards and best practices to ensure services are built for high availability and security.
  • Automation and Observability: Implement appropriate automation and observability to achieve low and continuously improving mean time to recovery (MTTR) for service-impacting incidents.
  • Incident Management: Document any incidents thoroughly, along with corresponding problem records and corrective actions.
  • Architectural Review: Participate in the Architectural Review Process for new and existing services, ensuring compliance with high-availability, observability, security, and cost efficiency standards.
  • Governance and Compliance: Enhance governance processes to ensure all platform components meet current standards.
  • Root Cause Analysis: Lead root cause analysis for major incidents, communicating with senior stakeholders, driving problem-solving, and debugging using best-practice techniques.
  • Experimentation and Remediation: Design and conduct fault injection experiments to identify potential weak points in high-availability architecture and work with engineering teams to remediate findings.
  • Collaboration and Documentation: Collaborate with engineering teams to optimize infrastructure for security, resiliency, and cost targets based on collected feedback. Document processes and maintain records related to infrastructure procedures and strategies, ensuring appropriate alerts and support procedures are in place for quick incident remediation.
Requirements
  • Technical Expertise: Adept at solving complex technical challenges and devising effective solutions.
  • Attention to Detail: Meticulous attention to detail to ensure high standards of availability and security.
  • Teamwork and Communication: Team player with strong interpersonal skills, able to work well within a team setting. Effective communicator, capable of explaining complex technical issues to both technical and non-technical audiences.
  • Leadership and Management: Proven leadership and team management skills.
  • Education and Experience: Bachelor's degree in Computer Science, Engineering, Information Technology, or equivalent experience. 5+ years of experience in site reliability engineering or a related field.
  • Technical Skills: Experience with systems programming languages, such as Go or Python, and shell scripting. Proficient with Terraform and infrastructure as code principles. Demonstrated proficiency in public cloud environments, particularly AWS. Hands-on experience with Kubernetes management within AWS EKS. Experience with CI/CD automation tools, such as CircleCI and ArgoCD. Experience with monitoring and logging using tools like Prometheus, Grafana, OpenTelemetry, CloudWatch, and Honeycomb.
About KUBRA Data Transfer Ltd

KUBRA Data Transfer Ltd is a fast-growing company that delivers customer communications solutions to some of the largest utility, insurance, and government entities across North America. We offer a casual work environment, competitive compensation, and a stellar benefits program.



  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    Job Title: Team Lead, Site Reliability
    We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. As a Team Lead, Site Reliability, you will be responsible for guiding our team in enhancing platform stability, reliability, and efficiency.
    Key Responsibilities: Ensure infrastructure...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    Job Title: Team Lead, Site Reliability
    We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms.
    Key Responsibilities: Ensure infrastructure and applications perform within established Service Level Agreements (SLA) and Service Level Objectives (SLO). Maintain well-documented...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    Job Title: Senior Site Reliability Engineer - Technical Lead
    We are seeking an experienced Senior Site Reliability Engineer to join our team as a Technical Lead. As a key member of our DevOps team, you will be responsible for guiding our platform stability, reliability, and efficiency initiatives.
    About the Role: Lead the development and implementation of...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    Job Title: Team Lead, Site Reliability
    We are seeking an experienced Site Reliability Engineer to join our team at KUBRA Data Transfer Ltd. As a Team Lead, Site Reliability, you will be responsible for guiding our DevOps team in optimizing our customer experience management platforms.
    Key Responsibilities: Ensure that infrastructure and applications perform...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    Job Title: Senior Site Reliability Engineer - Technical Lead
    We are seeking an experienced Senior Site Reliability Engineer to join our team as a Technical Lead. As a key member of our DevOps team, you will be responsible for guiding our efforts in optimizing our customer experience management platforms.
    Your technical expertise will be crucial in identifying...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    We are seeking a seasoned Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. As a key member of our team, you will be responsible for implementing automation and observability to achieve low and continuously improving mean time to recovery (MTTR) for service-impacting incidents.
    Key responsibilities...


  • Mississauga, Ontario, Canada MSI Reproductive Choices Full time

    About the Role
    We are seeking a highly skilled Site Reliability Engineer to join our team at Symcor. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining scalable, resilient, and secure OpenShift solutions.
    Key Responsibilities
    Install, configure, and maintain OpenShift clusters
    Improve OpenShift infrastructure...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    Job Title: Team Lead, Site Reliability
    Are you a seasoned Site Reliability Engineer with a passion for enhancing platform stability, reliability, and efficiency? We're seeking a skilled Team Lead, Site Reliability Engineer to guide our DevOps team in optimizing our customer experience management platforms.
    Key Responsibilities: Ensure infrastructure and...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    We are growing at KUBRA, and we're looking for a skilled Team Lead, Site Reliability Engineer, where you will guide our DevOps team in optimizing our customer experience management platforms.
    Main Responsibilities
    Implementing Automation and Observability: Implement appropriate automation and observability to achieve low and continuously improving mean time to...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    About the Role
    We are seeking a highly skilled Site Reliability Engineer to join our dynamic team at KUBRA Data Transfer Ltd. As a key member of our infrastructure team, you will play a critical role in ensuring the high availability, security, and performance of our cloud-based systems.
    Key Responsibilities
    Design and implement scalable and secure cloud...


  • Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time

    About the Role
    We are seeking a highly skilled Site Reliability Engineer to join our team at KUBRA Data Transfer Ltd. As a key member of our technical operations team, you will play a critical role in ensuring the high availability, security, and performance of our cloud-based infrastructure.
    Key Responsibilities
    Design and implement scalable and secure cloud...