Reliability Engineer

1 month ago


Old Toronto, Canada Chelsea Avondale Full time

Chelsea Avondale is the world’s most cutting-edge home insurance group. We have developed the most sophisticated risk modeling and insurance pricing technologies for home insurance and deploy that technology through our own insurance company.

Our team consists of some of the brightest minds in insurance, software development, finance, and operations. Our group includes our scientific research & engineering division (Skynet Software) and Canadian property & casualty insurance company (Max Insurance).

Together, our group is transforming the Canadian and global insurance landscape.

JOB DESCRIPTION:

Chelsea Avondale is looking for a Reliability Engineer with a background in infrastructure system engineering to support the growth of a secure, dynamic, and scalable IT environment across the group. Our business is going through rapid growth, and it is essential that our systems infrastructure keeps pace.

The Reliability Engineer will play a crucial role in ensuring the reliability, scalability, and performance of our systems, enabling the continuous delivery of our products and services. They will be accountable for ensuring overall availability, as well as enhancing Engineering teams’ capability to design, build and operate robust systems at scale.

This position is ideal for candidates who have an extraordinary sense of responsibility and are not afraid to roll up their sleeves. Our IT environment is not toolkit-rich. We are NOT looking for someone who wants to spend months installing a large number of tools from their preferred toolkit. We take pride in maintaining a fundamental stack of technologies, much of it in Python, and we are looking for someone who shares this mentality and thrives in a high-performance culture, with an appetite for work that is both challenging and constantly evolving.

RESPONSIBILITIES:

  • Play an integral role in the design, implementation & maintenance of AWS cloud server environments.
  • Design, implement, and maintain robust monitoring and alerting systems in Python to detect and respond to incidents in a timely manner.
  • Collaborate with cross-functional teams to enhance reliability of our systems and services.
  • Design, configure, deploy, and maintain infrastructure on AWS using best practices and industry standards.
  • Conduct post-incident analysis to identify root causes, implement corrective actions, and prevent similar issues in the future.
  • Assist in capacity planning & optimize services to provide scalable, stable, & secure systems.
  • Implement high availability and disaster recovery solutions to provide data redundancy, resilience, and data loss prevention.
  • Assist with the implementation of select network engineering solutions including firewalls, load balancing, VPNs & LANs, where necessary.
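As an illustration of the kind of lightweight, Python-only monitoring and alerting described above, here is a minimal sketch. It is not Chelsea Avondale's actual system: the names `HealthCheck` and `Monitor`, the consecutive-failure threshold, and the event strings are all assumptions made for the example. It uses only the standard library and raises an alert after a configurable number of consecutive failed probes, then records a resolution once the probe passes again.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HealthCheck:
    """Hypothetical probe wrapper: alerts after N consecutive failures."""
    name: str
    probe: Callable[[], bool]       # returns True when the service is healthy
    failure_threshold: int = 3      # assumed default, tune per service
    consecutive_failures: int = 0
    alerting: bool = False


@dataclass
class Monitor:
    """Runs every registered check once per call and records state changes."""
    checks: List[HealthCheck] = field(default_factory=list)
    events: List[str] = field(default_factory=list)

    def run_once(self) -> None:
        for check in self.checks:
            try:
                healthy = check.probe()
            except Exception:
                # A probe that raises is treated as a failed check.
                healthy = False

            if healthy:
                if check.alerting:
                    self.events.append(f"RESOLVED {check.name}")
                check.consecutive_failures = 0
                check.alerting = False
            else:
                check.consecutive_failures += 1
                crossed = check.consecutive_failures >= check.failure_threshold
                if crossed and not check.alerting:
                    # Alert once per incident, not on every failed probe.
                    check.alerting = True
                    self.events.append(f"ALERT {check.name}")
```

In practice the probe might wrap an HTTP request or a disk-usage check, and `run_once` would be called on a schedule; the events list stands in for whatever notification channel is used.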

PREFERRED EXPERIENCE & SKILLS:

  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or related field.
  • 5+ years of experience as a Reliability Engineer or in a similar role, with a focus on maintaining high-performance, scalable, and reliable web systems.
  • Hands-on experience with AWS cloud environments – instances, CloudWatch, EFS, etc.
  • Proficiency in Python is a must.
  • Experience using NGINX for reverse proxy, load balancing, and caching.
  • Experience with Unix / Windows server configuration, administration, performance tuning and troubleshooting.
  • Working knowledge of web technologies (web servers, DNS, SSL, browsers).
  • Working knowledge of web development processes (source control, deployment, etc.).
  • Experience with load testing, penetration testing, and securing cloud resources is beneficial.
