Site Reliability Engineer
4 weeks ago
KUBRA is a fast-growing company that delivers customer communications solutions to large utility, insurance, and government entities across North America. Our platform handles more than 1.5 billion customer interactions annually and reaches over 40% of households in the U.S. and Canada.
About the Role
As a Team Lead, Site Reliability Engineer, you will guide our DevOps team in optimizing our customer experience management platforms. This is a hybrid opportunity in Mississauga, ON.
- Ensure that infrastructure and applications perform within established Service Level Agreements (SLAs) and Service Level Objectives (SLOs).
- Maintain well-documented standards and best practices to ensure services are built for high availability and security.
- Implement appropriate automation and observability to achieve low and continuously improving mean time to recovery (MTTR) for service-impacting incidents.
- Document any incidents thoroughly, along with corresponding problem records and corrective actions.
- Participate in the Architectural Review Process for new and existing services, ensuring compliance with high-availability, observability, security, and cost efficiency standards.
- Enhance governance processes to ensure all platform components meet current standards.
- Lead root cause analysis for major incidents: communicate with senior stakeholders, drive problem-solving, and debug using best-practice techniques.
- Design and conduct fault injection experiments to identify potential weak points in high-availability architecture and work with engineering teams to remediate findings.
- Collaborate with engineering teams to optimize infrastructure for security, resiliency, and cost targets based on collected feedback.
- Document processes and maintain records related to infrastructure procedures and strategies, ensuring appropriate alerts and support procedures are in place for quick incident remediation.
Requirements
To succeed in this role, you will need:
- Bachelor's degree in Computer Science, Engineering, Information Technology, or equivalent experience.
- 5+ years of experience in site reliability engineering or a related field.
- Proven leadership and team management experience.
- Experience with systems programming languages, such as Go or Python, and shell scripting.
- Proficiency with Terraform and infrastructure-as-code principles.
- Demonstrated proficiency in public cloud environments, particularly AWS.
- Hands-on experience with Kubernetes management within AWS EKS.
- Experience with CI/CD automation tools, such as CircleCI and ArgoCD.
- Experience with monitoring and logging using tools like Prometheus, Grafana, OpenTelemetry, CloudWatch, and Honeycomb.
- AWS and Kubernetes certifications (Solutions Architect, SysOps Administrator, DevOps Engineer, CKA, CKS, CKAD, KCNA) are desirable.
What We Offer
At KUBRA, we offer a competitive salary of $120,000 per year, plus benefits and bonuses. We also provide opportunities for growth and development, including access to LinkedIn Learning courses and education reimbursement programs. Our office is small enough to allow creative individuals to flourish, yet large enough to provide long-term stability. We place a tremendous amount of responsibility on our team members to be productive, focused, and self-motivated.
-
Site Reliability Engineering Director
7 days ago
Mississauga, Ontario, Canada KUBRA Full time
KUBRA is a fast-growing company that delivers customer communications solutions to some of the largest utility, insurance, and government entities across North America. As Site Reliability Engineering Director, you will play a key role in ensuring the stability, reliability, and efficiency of our platforms. This is an exciting opportunity to lead a team of...
-
Principal Site Reliability Engineer
1 month ago
Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time
We are growing at KUBRA, and we're looking for a skilled Team Lead, Site Reliability Engineer to guide our DevOps team in optimizing our customer experience management platforms. Main Responsibilities: Implementing Automation and Observability: Implement appropriate automation and observability to achieve low and continuously improving mean time to...
-
Site Reliability Engineer Team Lead
3 weeks ago
Mississauga, Ontario, Canada Interesting Engineering, Inc. Full time
About Interesting Engineering, Inc.: Interesting Engineering, Inc. is a dynamic organization that offers cutting-edge solutions for customers. With a strong focus on innovation and stability, we are looking for a skilled Site Reliability Engineer Team Lead to join our team.
-
Site Reliability Engineering Team Lead
4 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
We are seeking a seasoned Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms. About the Role: The ideal candidate will have 5+ years of experience in site reliability engineering or a related field, with a strong background in systems programming languages, such as Go or Python, and shell scripting. They...
-
Site Reliability Engineering Team Manager
4 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
Job Overview: We are seeking an experienced Site Reliability Engineering Team Manager to join our dynamic team at KUBRA. As a leader in customer communications solutions, we deliver high-quality services to some of the largest utility, insurance, and government entities across North America.
-
Site Reliability Engineering Team Lead
2 days ago
Mississauga, Ontario, Canada KUBRA Full time
We are growing at KUBRA, and we're seeking a skilled Site Reliability Engineer to lead our DevOps team in optimizing our customer experience management platforms. The ideal candidate will have a passion for enhancing platform stability, reliability, and efficiency. Job Summary: The Site Reliability Engineering Team Lead will play a pivotal role in identifying...
-
Site Reliability Engineering Team Lead
7 days ago
Mississauga, Ontario, Canada KUBRA Full time
KUBRA: A Leader in Customer Experience Management. Are you a seasoned Site Reliability Engineer looking to take on a leadership role? Do you have a passion for enhancing platform stability, reliability, and efficiency? We are growing at KUBRA, a company that specializes in billing and payments, mapping, mobile apps, proactive communications, and artificial...
-
Site Reliability Engineer Team Lead
1 week ago
Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time
At KUBRA Data Transfer Ltd, we're seeking a highly skilled Team Lead, Site Reliability Engineer to join our DevOps team. We're growing rapidly, and this role will play a crucial part in optimizing our customer experience management platforms. About the Role: We're looking for someone with experience in implementing automation and observability to achieve low and...
-
Site Reliability Engineering Manager
5 days ago
Mississauga, Ontario, Canada KUBRA Full time
About KUBRA: KUBRA is a fast-growing company that delivers customer communications solutions to some of the largest utility, insurance, and government entities across North America. We offer billing and payments, mapping, mobile apps, proactive communications, and artificial intelligence solutions for customers. With more than 1.5 billion customer interactions...
-
Reliability Engineering Specialist
1 week ago
Mississauga, Ontario, Canada Thermo Fisher Scientific Full time
About the Role: We are seeking a highly skilled Reliability Engineering Specialist to join our team at Thermo Fisher Scientific. This is a full-time position that offers a competitive salary and benefits package. Job Description: The main focus of this position is to provide support for the Engineering department, with a primary emphasis on developing and...
-
Reliability Engineering Manager
4 weeks ago
Mississauga, Ontario, Canada KUBRA Data Transfer Ltd Full time
We are seeking a highly skilled Reliability Engineering Manager to join our team at KUBRA Data Transfer Ltd. in Mississauga, ON. This is a hybrid opportunity that offers a unique blend of technical and leadership challenges. As a Reliability Engineering Manager, you will play a critical role in ensuring the high availability and security of our customer...
-
Site Reliability Engineering Team Lead
7 days ago
Mississauga, Ontario, Canada KUBRA Full time
About the Role: We are looking for a skilled Site Reliability Engineer to lead our DevOps team in optimizing customer experience management platforms. Description: As a Site Reliability Engineering Team Lead, you will be responsible for guiding our DevOps team in enhancing platform stability, reliability, and efficiency. Your technical expertise will be...
-
Reliability Maintenance Engineer Assistant
4 weeks ago
Mississauga, Ontario, Canada Thermo Fisher Scientific Inc. Full time
Job Description: We are seeking a highly motivated and detail-oriented Reliability Maintenance Engineer Assistant to join our team at Thermo Fisher Scientific Inc. This role is an exciting opportunity to support the Engineering department in developing and implementing processes and procedures necessary to establish a Reliability Centered Maintenance (RCM)...
-
Platform Reliability Engineering Team Lead
3 weeks ago
Mississauga, Ontario, Canada KUBRA Full time
We are growing at KUBRA, a leading provider of customer communications solutions to some of the largest utility, insurance, and government entities across North America. Our team is seeking an experienced Site Reliability Engineer with a passion for enhancing platform stability, reliability, and efficiency. Job Description: As a Platform Reliability Engineering...
-
Reliability Systems Engineer
3 weeks ago
Mississauga, Ontario, Canada HF Sinclair Full time
HF Sinclair is seeking a skilled Reliability Systems Engineer to join our team in Alberta, Canada. This is an exciting opportunity for a motivated individual with a strong background in mechanical engineering and reliability analysis. About the Role: The successful candidate will be responsible for leading reliability improvement activities, stewarding...
-
Reliability Engineering Team Lead
1 week ago
Mississauga, Ontario, Canada Petro Papa Full time
About This Opportunity: Petro Canada Lubricants Inc, a leading provider of high-quality fuels and lubricants, is seeking an experienced Reliability Engineering Team Lead to join our team in Mississauga, ON. Job Summary: This leadership role will oversee the direction and execution of engineering requirements for maintenance activities, ensuring equipment...
-
Engineering Manager
4 weeks ago
Mississauga, Ontario, Canada HF Sinclair Full time
About the Role: We are seeking an experienced Engineering Manager to lead our Maintenance and Reliability team at HF Sinclair. This is a key position that requires strong leadership skills, technical expertise, and a passion for delivering results. Job Description: The successful candidate will be responsible for providing strategic direction and oversight to our...
-
Reliability Engineer for Asset Maintenance
6 days ago
Mississauga, Ontario, Canada HF Sinclair Full time
Job Overview: We are seeking a skilled Reliability Engineer to join our team as an Asset Maintenance Planner. This is a challenging role that requires strong planning, analytical, and communication skills. Responsibilities: Develop comprehensive plans for maintenance work requests, taking into account priority, risk rank, and equipment reliability...
-
Automation and Service Reliability Expert
1 week ago
Mississauga, Ontario, Canada KUBRA Full time
Job Description: KUBRA is seeking a seasoned Site Reliability Engineer to spearhead the optimization of our customer experience management platforms. This role requires expertise in IT Service Delivery and Management, with a passion for enhancing platform stability, reliability, and efficiency. Key Responsibilities: Identify potential issues and resolve complex...
-
Reliability Engineer
1 week ago
Mississauga, Ontario, Canada AtkinsRéalis Full time
AtkinsRéalis is one of Canada's largest private sector nuclear engineering groups, providing a wide range of services to the nuclear industry for over 60 years. The department of Probabilistic Safety Analysis at AtkinsRéalis is engaged in probabilistic safety analysis (PSA) in support of existing and new CANDU Nuclear Power Plants both domestic and...