Reliability Engineer

3 weeks ago


Toronto ON, Canada Chelsea Avondale Full time

Chelsea Avondale is the world’s most cutting-edge home insurance group. We have developed the most sophisticated risk modeling and insurance pricing technologies for home insurance and deploy that technology through our own insurance company.

Our team consists of some of the brightest minds in insurance, software development, finance, and operations. Our group includes our scientific research & engineering division (Skynet Software) and Canadian property & casualty insurance company (Max Insurance).

Together, our group is transforming the Canadian and global insurance landscape.

JOB DESCRIPTION:

Chelsea Avondale is looking for a Reliability Engineer with a background in infrastructure system engineering to support the growth of a secure, dynamic, and scalable IT environment across the group. Our business is going through rapid growth, and it is essential that our systems infrastructure keeps pace.

The Reliability Engineer will play a crucial role in ensuring the reliability, scalability, and performance of our systems, enabling the continuous delivery of our products and services. They will be accountable for ensuring overall availability, as well as enhancing Engineering teams’ capability to design, build and operate robust systems at scale.

This position is ideal for candidates who have an extraordinary sense of responsibility and are not afraid to roll up their sleeves. Our IT environment is not toolkit rich. What we are NOT looking for is someone who wants to take months installing a large number of tools from their preferred toolkit. We take pride in maintaining a fundamental stack of technologies, much of it in Python, and we are looking for someone who shares this mentality. We are looking for someone who thrives in a high-performance culture and is eager for work that is both challenging and constantly evolving.

RESPONSIBILITIES:

  • Play an integral role in the design, implementation & maintenance of AWS cloud server environments.
  • Design, implement, and maintain robust monitoring and alerting systems in Python to detect and respond to incidents in a timely manner.
  • Collaborate with cross-functional teams to enhance reliability of our systems and services.
  • Design, configure, deploy, and maintain infrastructure on AWS using best practices and industry standards.
  • Conduct post-incident analysis to identify root causes, implement corrective actions, and prevent similar issues in the future.
  • Assist in capacity planning & optimize services to provide scalable, stable, & secure systems.
  • Implement high availability and disaster recovery solutions to provide data redundancy, resilience, and data loss prevention.
  • Assist with the implementation of select network engineering solutions including firewalls, load balancing, VPNs & LANs, where necessary.

PREFERRED EXPERIENCE & SKILLS:

  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or related field.
  • 5+ years of experience as a Reliability Engineer or similar role, with a focus on maintaining high-performance, scalable, and reliable web systems.
  • Hands-on experience with AWS cloud environments – instances, CloudWatch, EFS, etc.
  • Proficiency in Python is a must.
  • Experience using NGINX for reverse proxy, load balancing, and caching.
  • Experience with Unix / Windows server configuration, administration, performance tuning and troubleshooting.
  • Working knowledge of web technologies (web servers, DNS, SSL, Browsers).
  • Working knowledge of web development processes (source control, deployment, etc.).
  • Experience load testing, pen testing, and providing security for cloud resources is beneficial.

  • Toronto, ON, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Toronto, ON, Canada Nityo Infotech Full time

    Job Responsibilities: Objectives of this Role Run the IKP clusters by monitoring availability and taking a holistic view of system health Build tools and automation to manage platform infrastructure and services Improve reliability, quality, and time to upgrade cluster and service versions Measure and optimize system performance and resource...


  • Toronto, ON, Canada Thomson Reuters Full time

    (Canada) Site Reliability Engineer (Contract) Contract (9 months 4 days) Published 3 days ago New Relic Data Dog Site Reliability Engineer - in the Service Management Organization Do you have experience in IT Service Management, working with cloud providers, software development, and technology infrastructure? The Site Reliability Engineer will analyze...

  • Reliability Engineer

    4 weeks ago


    Toronto, Canada Tata Consultancy Services Full time

    About TCS: TCS operates on a global scale, with a diverse talent base of more than 600,000 associates representing 153 nationalities across 55 countries. TCS has been recognized as a Global Top Employer by the Top Employers Institute - one of only eight companies worldwide to have achieved this status. Our organizational structure is domain-led and designed...

  • Reliability Engineer

    2 weeks ago


    Toronto, Ontario, Canada CSG Talent Full time

    Join a Leading Mining Company in Canada as a Reliability Engineer. This is the best opportunity to grow your career in the maintenance department of a large mining company with global assets. This is a residential role, and it comes with a very attractive salary and great relocation and living allowances. Description: Make a significant impact by minimizing...


  • Toronto, ON, Canada eTeam Full time

    Remote Work Duration 4 months - Preference is to find candidates who are willing to be converted to full-time employees. The conversion decision will be made based on performance. Job Description Role Description: Defining and measuring reliability goals—SLIs, SLOs, and error budgets for user journey. Designing for and implementing observability (ELK,...


  • Toronto, ON, Canada eTeam Full time

    Remote work Duration - 4 months - Preference is to find candidates who are willing to be converted to full time employee. The conversion decision will be made based on performance. Job Description: Role Desc: Defining and measuring reliability goals—SLIs, SLOs, and error budgets for user journey Designing for and implementing observability: (ELK,...


  • Toronto, ON, Canada Lightspeed Full time

    Hi there! Thanks for stopping by. Are you actively looking for a new opportunity? Or just checking the market? Well… you might just be in the right place! We’re looking for a Principal Site Reliability Engineer to join our NuORDER by Lightspeed team in North America. NuORDER by Lightspeed builds software solutions that help merchants grow the size and...


  • Old Toronto, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Toronto, ON, Canada ClickHouse Full time

    We are committed to providing our customers with reliable and secure services so we are building out our newly formed Site Reliability Engineering team. As one of the first joiners to our Reliability Engineering Team at ClickHouse, you will be responsible for building and leading processes to ensure the reliability, availability, scalability, and...


  • Toronto, ON, Canada Akamai Full time

    Do you have a passion for cutting edge technologies and tackling system problems? Join our Site Reliability team. Our Team builds and delivers highly secure network security frameworks to protect our customers. We collaborate to create next-generation initiatives supporting automation, deployment, and monitoring of 3rd party cloud infrastructure. Help us...


  • Bath, ON, Canada Randstad Canada Full time

    One of our top partners in the Electric Vehicle space is preparing to build a facility that will be a crucial part of Canada’s Electric Vehicle (EV) supply chain in the Kingston area. Our client is looking for a Reliability Engineer who will be responsible for identifying the maintenance requirements for assets to increase plant productivity, improve...


  • Toronto, Ontario, Canada Zortech Solutions Full time

    Hi, hope you are doing great. This is Priya Rajput from Zortech Solutions, reaching out about an exciting job opening. Kindly have a look at the job description and reply with your feedback. My mail ID is or call me on . Role: Site Reliability Engineer Location: Toronto, ON-Onsite Duration: Fulltime Permanent Skills and Responsibilities:...


  • Toronto, ON, Canada Akamai Full time

    Site Reliability Engineer II Do you have a passion for cutting edge technologies and tackling system problems? Are you a self-starting professional who thrives in a dynamic environment? Join our Site Reliability team. Our Team builds and delivers highly secure network security frameworks to protect our customers. We collaborate to create next-generation...


  • Toronto, ON, Canada emagine Consulting Full time

    Work Model: Remote Business Trips: Occasional to Copenhagen Assignment Type: B2B Project Length: Long-term Start Date: ASAP Project Language: English About the Role: A unique opportunity to join as a Site Reliability Engineer to the dynamic, ambitious, and international company where you will work with a lot of skilled colleagues. You will join...


  • Ajax, ON, Canada Gradient IT Full time

    We are looking for a passionate Site Reliability Engineer with a deep-rooted foundation in DevSecOps and Open Source Technology. The engineer should be passionate about automation and building highly scalable and available services in the cloud. You will help lead a team of engineers to build tooling, automation, and support Spinnaker on behalf of our...