Senior Site Reliability Engineer

2 weeks ago


Toronto, Ontario, Canada Relayfi Full time

Our mission is to increase the success rate of small businesses. Traditional banking has been a growth limiter rather than a growth enabler for business owners, and we're changing that. Relay is the all-in-one, collaborative money management platform. We're building for employer SMBs and their finance function, internal and external, and are focused on delivering a human-centric customer experience. Ultimately, we help SMBs be 'on the money'.

We're looking for an incredible Senior Site Reliability Engineer to join our Trust team. Your love of making high-impact decisions daily and your desire to help shape the future of Relay are going to be crucial. The team's vision is "Protecting the cathedral while enabling the bazaar" - quite a challenge given the scope of our multiple environments.

*Please note that we will only consider applicants who are based in the Eastern Time zone. For those based in the Greater Toronto Area, we have a hybrid work environment and choose to collaborate in the Toronto office twice a week.

What You'll Be Doing:
  • Join the team that owns our production infrastructure (AWS, Kubernetes, PostgreSQL databases, Terraform, Terragrunt)
  • Review infrastructure change requests, and triage & fix high-risk security and privacy issues in infrastructure components
  • Write playbooks, and run game days and threat modelling
  • Build monitoring systems to dynamically assess the infrastructure health
  • Improve the posture of our data repositories (database, warehouse, lake): engine upgrades, zero-downtime migrations, privacy tagging
  • Provide guidance and mentoring for the rest of the team and help evolve Relay into a world-class security-oriented organization
  • Participate in the on-call rotation
Who You Are:
  • You have 5+ years of experience working in a DevOps or SRE role
  • You have experience as an SRE working with these technologies: AWS, Datadog, GitHub, GHA, k8s, etc.
  • You have experience as a DBA (Aurora RDS, PostgreSQL, DynamoDB, ElastiCache)
  • You have experience with Terraform, Terragrunt, TypeScript
  • You have a strong security and operations focus; we are looking for someone who will help us continue building security into every aspect of our work and who is ready to be on call for production issues
  • You are a team player - our team is small and mighty, and we collaborate constantly - we want someone who is always willing to pitch in and isn't afraid to ask for help
  • You are curious. You keep yourself on the bleeding edge of infrastructure best practices.
Bonus Points:
  • Show us your home lab - we have Ubiquiti gear everywhere and we like to geek out on the k8s clusters that control our in-house experience
  • Send us your HackerOne account ID - security permeates everything we do
  • You've joined a company at its early stages and have seen it through scale
  • You have experience working in a fintech startup
Our SRE Tech Stack:
  • Container Orchestration: Kubernetes, ArgoCD, ECS
  • Cloud Platform: AWS (DynamoDB, RDS Postgres, Lambda, S3, SQS, SNS, SES, Elasticsearch, ECS, EKS, and more)
  • Monitoring: Datadog
  • Relevant Languages: JavaScript/TypeScript, Go, Python
  • IaC: Terraform/Terragrunt
  • Tools: GitHub, GHA, Cloudflare
Our Commitment To You:
  • Competitive salary and meaningful equity: every team member gets a piece of the pie.
  • Comprehensive health benefits: we offer full health benefits + an HSA/WSA starting from day 1 so you get the coverage you need.
  • Considerable vacation/end-of-year holiday shutdown: we take time off to reset and recharge so we come back better for our customers.
  • Hybrid work environment: we love collaborating and connecting in the office twice a week, and we offer catered lunches and a snack/beverage program for the days we're in office. Don't forget to bring in your furry friends.
  • Personal and professional growth: support from leaders who care about your growth and success through regular feedback and coaching. Our goal is to make Relay a step-change career opportunity.
  • Top-tier equipment: we'll make sure you have everything you need to produce your best work.
  • Team-first culture: we're passionate about working collaboratively, bonding through team events, and most importantly having fun.
The Interview Process:
  • Stage 1: A 30-minute Google Meet video call with a member of the Talent Team
  • Stage 2: A 45-minute Google Meet video call with the SRE Lead
  • Stage 3: A 60-minute case study presentation with members of the Trust team
  • Stage 4: A 30-minute Google Meet video call with the Head of Engineering and Co-Founder

Research shows that women-identifying and other marginalized individuals tend to apply only when they meet 100% of the qualifications; if you don't have all the listed qualifications, we encourage you to apply anyway.

What's Important to Us:

At Relay, we believe that diversity is key to building high-performing teams, and creating an inclusive work environment is our priority. We are an equal-opportunity employer and we welcome people of diverse backgrounds, perspectives, and skills.

We will work with applicants to provide accommodations at any stage of the hiring process. If you require accommodations during the interview process, please email your People Team contact, and we will work with you to meet your needs.

