Site Reliability Engineer Lead
5 days ago
We are seeking a highly skilled Site Reliability Engineer Lead to join our team at KUBRA Data Transfer Ltd. As a key member of our DevOps team, you will be responsible for ensuring the stability, reliability, and efficiency of our customer experience management platforms.
Key Responsibilities
- Infrastructure Optimization: Ensure that our infrastructure and applications perform within established Service Level Agreements (SLAs) and Service Level Objectives (SLOs).
- Standardization and Best Practices: Maintain well-documented standards and best practices to ensure services are built for high availability and security.
- Automation and Observability: Implement appropriate automation and observability to achieve low and continuously improving mean time to recovery (MTTR) for service-impacting incidents.
- Incident Management: Document any incidents thoroughly, along with corresponding problem records and corrective actions.
- Architectural Review: Participate in the Architectural Review Process for new and existing services, ensuring compliance with high-availability, observability, security, and cost efficiency standards.
- Governance and Compliance: Enhance governance processes to ensure all platform components meet current standards.
- Root Cause Analysis: Lead root cause analysis for major incidents: communicate with senior stakeholders, drive problem-solving, and debug using best-practice techniques.
- Experimentation and Remediation: Design and conduct fault injection experiments to identify potential weak points in high-availability architecture and work with engineering teams to remediate findings.
- Collaboration and Documentation: Collaborate with engineering teams to optimize infrastructure for security, resiliency, and cost targets based on collected feedback. Document processes and maintain records related to infrastructure procedures and strategies, ensuring appropriate alerts and support procedures are in place for quick incident remediation.
Requirements
- Technical Expertise: Adept at solving complex technical challenges and devising effective solutions.
- Attention to Detail: Meticulous approach that upholds high standards of availability and security.
- Teamwork and Communication: Team player with strong interpersonal skills. Effective communicator, capable of explaining complex technical issues to both technical and non-technical audiences.
- Leadership and Management: Proven leadership and team management skills.
- Education and Experience: Bachelor's degree in Computer Science, Engineering, Information Technology, or equivalent experience. 5+ years of experience in site reliability engineering or a related field.
- Technical Skills: Experience with systems programming languages, such as Go or Python, and shell scripting. Proficient with Terraform and infrastructure as code principles. Demonstrated proficiency in public cloud environments, particularly AWS. Hands-on experience with Kubernetes management within AWS EKS. Experience with CI/CD automation tools, such as CircleCI and ArgoCD. Experience with monitoring and logging using tools like Prometheus, Grafana, OpenTelemetry, CloudWatch, and Honeycomb.
KUBRA Data Transfer Ltd is a fast-growing company that delivers customer communications solutions to some of the largest utility, insurance, and government entities across North America. We offer a casual work environment, competitive compensation, and a stellar benefits program.
-
Site Reliability Engineer Lead
4 days ago
Mississauga, Ontario, Canada Devopshunt Full time
Job Overview
We are seeking a highly skilled Site Reliability Engineer Lead to join our team at Devopshunt. As a key member of our DevOps team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities
Design and implement scalable and highly available cloud infrastructure...
-
Site Reliability Engineering Team Lead
4 days ago
Mississauga, Ontario, Canada Devopshunt Full time
About the Role
We are seeking an experienced Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. As a Site Reliability Engineering Team Lead, you will be responsible for guiding our team in identifying potential issues, resolving complex problems, and leading technical and business discussions.
Key...
-
Site Reliability Engineer Team Lead
7 days ago
Mississauga, Ontario, Canada KUBRA Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer Team Lead to join our team at KUBRA. As a key member of our DevOps team, you will be responsible for guiding our team in optimizing our customer experience management platforms.
Key Responsibilities
Ensure that infrastructure and applications perform within established Service Level...
-
Site Reliability Engineer Team Lead
5 days ago
Mississauga, Ontario, Canada KUBRA Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer Team Lead to join our team at KUBRA. As a key member of our DevOps team, you will be responsible for guiding our team in optimizing our customer experience management platforms.
Key Responsibilities
Ensure that infrastructure and applications perform within established Service Level...
-
Lead Site Reliability Engineer
4 weeks ago
Mississauga, Ontario, Canada Mimecast Full time
Senior Site Reliability Engineer
Contribute to the Development of Advanced Cloud-Enabled AI Security Solutions
Are you enthusiastic about software security? Do you excel in implementing large-scale public cloud solutions? Are you eager to apply Machine Learning to tackle intricate challenges? This position could be the perfect fit for you. Our Communication...
-
Lead Site Reliability Engineer
4 weeks ago
Mississauga, Ontario, Canada Mimecast Full time
Senior Site Reliability Engineer
Contribute to the Development of Advanced Cloud-Scalable AI-Driven Security Solutions
Are you passionate about software security? Do you excel in implementing public cloud solutions at scale? Are you eager to utilize Machine Learning to tackle intricate challenges? This position may be an excellent fit for you. Our...
-
Lead Site Reliability Engineer
4 weeks ago
Mississauga, Ontario, Canada HOVER SENIOR LIVING COMMUNITY Full time
Senior Site Reliability Engineer
Contribute to the Development of Advanced Cloud-Scalable Security Solutions
Are you passionate about software security? Do you excel in implementing scalable public cloud solutions? Are you eager to apply Machine Learning to tackle intricate challenges? This position may be the perfect fit for you. Our innovative security...
-
Lead Site Reliability Engineer
4 weeks ago
Mississauga, Ontario, Canada Mimecast Full time
Senior Site Reliability Engineer
Contribute to the Development of Advanced Cloud-Scalable AI-Driven Security Solutions
Are you passionate about software security? Do you excel in implementing large-scale public cloud solutions? Are you eager to apply Machine Learning to tackle intricate challenges? This position could be the perfect fit for you. Our...
-
Lead Site Reliability Engineer
4 weeks ago
Mississauga, Ontario, Canada HOVER SENIOR LIVING COMMUNITY Full time
Senior Site Reliability Engineer
Contribute to the Development of Advanced Cloud-Scalable AI-Driven Security Solutions
Are you passionate about software security? Do you excel in implementing large-scale public cloud solutions? Are you eager to utilize Machine Learning to tackle intricate challenges? This position could be an excellent fit for you. Our...
-
Lead Site Reliability Engineer
4 weeks ago
Mississauga, Ontario, Canada HOVER SENIOR LIVING COMMUNITY Full time
Senior Site Reliability Engineer
Contribute to Innovative Cloud-Driven Security Solutions
Are you passionate about software security? Do you excel in implementing scalable public cloud solutions? Are you eager to apply Machine Learning to tackle intricate challenges? This position may be the perfect fit for you. Our advanced Communication and Collaboration...
-
Site Reliability Engineer
1 week ago
Mississauga, Ontario, Canada KUBRA Full time
Are you passionate about transforming and optimizing complex infrastructures? Do you thrive on solving challenging technical problems and ensuring high availability, security, and performance in cloud environments?
At KUBRA, we're seeking an enthusiastic and skilled Site Reliability Engineer to join our dynamic team. You'll work with cutting-edge technologies...
-
Manager of Site Asset Reliability
4 weeks ago
Mississauga, Ontario, Canada Maple Leaf Foods Full time
The Role: In this pivotal position, you will report directly to the Head of Asset Management and Reliability, becoming an integral part of the Asset Reliability Group (ARG) at Maple Leaf Foods. The ARG is dedicated to enhancing manufacturing reliability and operational maturity across the Maple Leaf network of facilities in North America. Your...
-
Senior Site Reliability Engineer
5 days ago
Mississauga, Ontario, Canada Roche Full time
About the Role
Senior Site Reliability Engineer (Kubernetes Platform) - Digital Products and Enablement
At Roche, we're revolutionizing healthcare by developing personalized medicine and advanced diagnostics. To make medical processes faster, safer, and more accessible, we're heavily investing in software and digital solutions.
The team you'll be...
-
Senior Site Reliability Engineer
7 days ago
Mississauga, Ontario, Canada Roche Full time
About the Role
Senior Site Reliability Engineer (Kubernetes Platform) - Digital Products and Enablement
At Roche, we're revolutionizing healthcare by developing personalized medicine and advanced diagnostics. To make medical processes faster, safer, and more accessible, we're heavily investing in software and digital solutions.
The team you'll be...
-
Manager of Site Asset Reliability
4 weeks ago
Mississauga, Ontario, Canada Maple Leaf Foods Full time
Position Overview: This role is integral to the Asset Reliability Group (ARG) at Maple Leaf Foods, reporting directly to the Head of Asset Management and Reliability.
Key Responsibilities:
- Lead and oversee the implementation and sustainability of asset reliability initiatives at the plant level.
- Collaborate with cross-functional teams to enhance plant...
-
Site Reliability Engineer
6 days ago
Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our dynamic team at KUBRA Data Transfer Ltd. As a key member of our infrastructure team, you will be responsible for ensuring the high availability, security, and performance of our cloud-based systems.
Key Responsibilities
- Infrastructure Optimization: Ensure that our...
-
Site Reliability Engineer
7 days ago
Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our dynamic team at KUBRA Data Transfer Ltd. As a key member of our infrastructure team, you will be responsible for ensuring the high availability, security, and performance of our cloud-based systems.
Key Responsibilities
- Infrastructure Optimization: Ensure that our...