Reliability Engineer

2 months ago


Old Toronto, Canada | Chelsea Avondale | Full time

Chelsea Avondale is the world’s most cutting-edge home insurance group. We have developed sophisticated risk modeling and insurance pricing technologies for home insurance, and we deploy them through our own insurance company.

Our team consists of some of the brightest minds in insurance, software development, finance, and operations. Our group includes our scientific research & engineering division (Skynet Software) and our Canadian property & casualty insurance company (Max Insurance).

Together, our group is transforming the Canadian and global insurance landscape.

JOB DESCRIPTION:

Chelsea Avondale is looking for a Reliability Engineer with a background in infrastructure system engineering to support the growth of a secure, dynamic, and scalable IT environment across the group. Our business is going through rapid growth, and it is essential that our systems infrastructure keeps pace.

The Reliability Engineer will play a crucial role in ensuring the reliability, scalability, and performance of our systems, enabling the continuous delivery of our products and services. They will be accountable for overall availability and for strengthening the Engineering teams’ ability to design, build, and operate robust systems at scale.

This position is ideal for candidates who have an extraordinary sense of responsibility and are not afraid to roll up their sleeves. Our IT environment is not toolkit-rich. We are NOT looking for someone who wants to spend months installing a large number of tools from their preferred toolkit. We take pride in maintaining a fundamental stack of technologies, much of it in Python, and we are looking for someone who shares this mentality.

If you thrive in a high-performance culture and are eager for work that is both challenging and constantly evolving, this role is perfect for you. We strongly encourage and help our team members to develop their skill sets within our organization. On your journey with us, you will have the opportunity to learn and grow rapidly and to take on more responsibility.

RESPONSIBILITIES:

  • Play an integral role in the design, implementation & maintenance of AWS cloud server environments.
  • Design, implement, and maintain robust monitoring and alerting systems in Python to detect and respond to incidents in a timely manner.
  • Collaborate with cross-functional teams to enhance reliability of our systems and services.
  • Design, configure, deploy, and maintain infrastructure on AWS using best practices and industry standards.
  • Conduct post-incident analysis to identify root causes, implement corrective actions, and prevent similar issues in the future.
  • Assist in capacity planning & optimize services to provide scalable, stable, & secure systems.
  • Implement high availability and disaster recovery solutions to provide data redundancy, resilience, and data loss prevention.
  • Assist with the implementation of select network engineering solutions including firewalls, load balancing, VPNs & LANs, where necessary.

PREFERRED EXPERIENCE & SKILLS:

  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or related field.
  • 1+ years of experience as a Reliability Engineer or in a similar role, with a focus on maintaining high-performance, scalable, and reliable web systems.
  • We also encourage highly motivated new grads to apply.
  • Hands-on experience with AWS cloud environments – instances, CloudWatch, EFS, etc.
  • Proficiency in Python is a must.
  • Experience using NGINX for reverse proxy, load balancing, and caching.
  • Experience with Unix / Windows server configuration, administration, performance tuning and troubleshooting.
  • Working knowledge of web technologies (web servers, DNS, SSL, browsers).
  • Working knowledge of web development processes (source control, deployment, etc.).
  • Experience with load testing, penetration testing, and securing cloud resources is beneficial.