Site Reliability Engineer

4 weeks ago


Toronto, Canada Hopper Full time
About the job

We are looking for a senior Site Reliability Engineer to join the Platform Infrastructure team at Hopper. We manage a large infrastructure in Google Cloud that is used by hundreds of engineers worldwide to provide a first-class experience to millions of end users around the world.

If the traits described below sound like you, this role might be a great fit:

  • You are passionate about automating everything possible.
  • You like to provide the best user experience to the engineers deploying services, with their corresponding infrastructure, on top of the platform we build and manage.
  • You also like the infrastructure to be as scalable, reliable, secure and optimized as possible.
  • You like to solve problems in a practical way, building solutions that are simple, reliable, cost-effective and easy to use.

What would your day-to-day look like:

You will improve and evolve our current infrastructure by improving and evolving the tooling used to manage it. As we keep growing and adding services, we have to adapt our infrastructure offering to the engineering teams so that innovation can keep happening at all layers of the company. We do this while keeping the infrastructure as simple and homogeneous as possible. You will also help support incidents and join the on-call rotation for platform incidents, as each engineering team has its own on-call rotation (the team is spread across America and Europe, so you can sleep at night). You will also help answer questions and resolve problems engineers face with our infrastructure, and approve PRs that require Platform supervision. You will be part of a small and highly efficient team of SREs.

An ideal candidate has:

  • Strong background in SRE, DevOps, Software Engineering or Systems Engineering
  • Troubleshooting skills
  • System design with good analytical capabilities
  • Good communication skills
  • Knowledge of major cloud providers, preferably Google Cloud
  • Infrastructure as Code, preferably with Terraform
  • Containers, Kubernetes and related tooling like Kustomize and Helm
  • Service Mesh, preferably with Istio
  • Networking knowledge: DNS, TLS, certificates, ingresses, etc.
  • Observability with log collection, metrics, APM, etc., preferably Datadog
  • Security knowledge: IAM, RBAC, network security, etc.
  • Knowledge of authentication and authorization technologies
  • CI/CD
  • Database technologies
  • Competent in scripting with Bash and Python or other scripting languages

Perks and benefits of working with us:

  • Well-funded and proven startup with large ambitions, competitive salary and stock options/RSUs
  • Unlimited PTO
  • Flexport All Access Pass OR Work-from-home stipend
  • Entrepreneurial culture where pushing limits and taking risks is everyday business
  • Open communication with management and company leadership
  • Small, dynamic teams = massive impact
  • 100% employer-paid telemedicine, medical, dental, vision, disability and life insurance plans
  • Access to a 401K plan or RRSP (depending on whether you are located in the USA or Canada)

More about Hopper

At Hopper, we are on a mission to become the leading travel platform globally – powering Hopper’s mobile app, website and our B2B business, HTS (Hopper Technology Solutions). By leveraging massive amounts of data and advanced machine learning algorithms, Hopper combines its world-class travel agency offering with proprietary fintech products to bring transparency, flexibility and savings to travelers globally. We have developed several unique fintech solutions that address everything from pricing volatility to trip disruptions – helping people travel better and save more on their trips.

The Hopper platform serves hundreds of millions of travelers globally and continues to capture market share around the world. Ranked the third-largest online travel agency in North America, the Hopper app has been downloaded over 120 million times and has become especially popular among younger travelers – with 70% of its users being Gen Z and millennials.

While everyone knows us as the Gen Z and millennial travel app, Hopper has evolved to become much more than that. In recent years, we’ve grown into a global travel agency and travel fintech provider that powers some of the world’s largest brands. Through HTS, our B2B division, the company supercharges its partners’ direct channels by integrating our fintech products on their sites or powering end-to-end travel portals.
Today, our partners include leading travel brands like Capital One, Nubank, Air Canada and many more.

Here are just a few stats that demonstrate the company’s recent growth:

  • Hopper sells billion worth of travel and travel fintech every year. In 2023, over billion trips were planned through the Hopper app and our HTS partnerships.
  • Our fintech products – including Price Freeze, Flexibility for Any Reason and Flight Disruption Assistance – have exceptionally strong CSAT because the terms are always clear, and customers receive instant, no-questions-asked resolutions. Almost 30% of our app customers purchase at least one fintech product when making a booking, and consumers are more likely to repurchase if they add fintech to their booking than if they book travel alone.
  • Given the success of its fintech products, Hopper launched a B2B initiative, HTS (Hopper Technology Solutions), which represents more than 50% of the business. Through HTS, any travel provider (airlines, hotels, banks, travel agencies, etc.) can integrate and seamlessly distribute Hopper’s fintech or travel inventory on their direct channels. As its first HTS partnership, the company partnered with Capital One to co-develop Capital One Travel, a new travel portal designed specifically for cardholders. Other HTS partners include Air Canada, Uber, CommBank, Nubank and Flair Airlines, with many more in the pipeline.
  • Featured in Apple’s Best of the App Store list of Essential Travel Apps in 2023 and recognized by the likes of Fast Company’s Most Innovative Companies, Hopper has been downloaded over 120 million times and continues to have millions of new installs each month. Hopper is now the third-largest online travel agency in North America, and 70% of our app customers are Gen Z and millennial travelers.
  • Hopper has raised over $750 million USD of private capital and is backed by some of the largest institutional investors and banks in the world.

Hopper is primed to continue its acceleration as the world’s fastest-growing mobile-first travel marketplace. Come take off with us.

  • Toronto, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Old Toronto, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Old Toronto, Ontario, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure Kubernetes...


  • Old Toronto, Canada Reperio Human Capital Full time

    Site Reliability Engineer 100421 Desired skills: Site Reliability Engineer, SRE, Cloud, Permanent, Remote Site Reliability Engineer Location: Ireland/UK Salary: €70K+ Type: Permanent, Full-time We're seeking experienced Site Reliability Engineers who excel at ensuring the reliability and scalability of production systems, and possess extensive experience...


  • Old Toronto, Canada Reperio Human Capital Full time

    Site Reliability Engineer 100421 Desired skills: Site Reliability Engineer, SRE, Cloud, Permanent, Remote Location: Ireland/UK Salary: €70K+ Type: Permanent, Full-time We're seeking experienced Site Reliability Engineers who excel at ensuring the reliability and scalability of production systems, and possess extensive experience with monitoring and...


  • Toronto, Canada Infotek Consulting Services Inc. Full time

    Infotek Consulting is searching for a Site Reliability Engineer. This is a remote opportunity with some travel involved. Job Description: Our EPM (Event and Performance Management) team is an availability, performance and reliability management discipline that supports the optimization of the operati


  • Toronto, Canada Sigmaways Inc Full time

    We're seeking a Site Reliability Engineer to join our team with expertise in Kubernetes and troubleshooting. Responsibilities: Monitor, measure, and report alerts, overall health, performance, and capacity of one or more services. Gain deep knowledge and learn the application stack. Ability to debug and optimize code and automate routine tasks. Function well in a...


  • Toronto, Ontario, Canada Zortech Solutions Full time

    Hi, hope you are doing great. This is Priya Rajput from Zortech Solutions, reaching out about an exciting job opening; kindly have a look at the job description and reply with your feedback. My mail ID is or call me on . Role: Site Reliability Engineer Location: Toronto, ON-Onsite Duration: Fulltime Permanent Skills and Responsibilities:...


  • Old Toronto, Canada Epsilon Solutions Ltd. Full time

    Job Title: Site Reliability Engineer Location: Toronto, ON Skills And Responsibilities: Collaborate with teams to enhance application and transaction scalability using Azure Kubernetes Service (AKS) and Azure scalability features. Develop application monitoring strategies using New Relic, Devo, and Azure Monitor, including creating monitors and dashboards....


  • Toronto, Canada OnX Canada Full time

    OnX is looking for a Site Reliability Engineer for one of our clients in Toronto. Client: Financial Services Location: Toronto, mostly remote Duration: 6 months with potential extension JBoss in middleware experience is super important Responsibilities: Following the senior technicians plans to buil


  • Toronto, Ontario, Canada OnX Canada Full time

    OnX is looking for a Site Reliability Engineer for one of our clients in Toronto. Client: Financial Services Location: Toronto, mostly remote Duration: 6 months with potential extension JBoss in middleware experience is super important Responsibilities: Following the senior technicians plans to buil


  • Toronto, Ontario, Canada Lorven Technologies Full time

    Job Title : Site Reliability Engineer (SRE) Location : Toronto, CA Duration : Long term A Bachelor's degree in Computer Science or related technical field (Example: Mathematics/Engineering/Physics), or equivalent practical experience. Advanced knowledge of the following SRE practices and technologies In-depth hands-on experience in a variety of...


  • Toronto, Ontario, Canada Hour Consulting Full time

    Our client, a fast growing Fintech Startup is on a mission to redefine how to protect user identity, providing users secure control over personal information through a privacy compliant network. This approach creates higher customer interaction and sales conversions, while improving overall security for both customers and businesses. They are a team based...


  • Toronto, Canada Sigmaways Inc Full time

    We're seeking a Site Reliability Engineer to join our team with expertise in Kubernetes and troubleshooting. Responsibilities: Monitor, measure, and report alerts, overall health, performance, and capacity of one or more services. Gain deep knowledge and learn the application stack. Ability to debug and optimize code and automate routine tasks. Function...