Senior Site Reliability Engineer

24 hours ago

Canada Thinkific Full time

Are you an experienced Site Reliability Engineer looking for a new challenge? We’re looking for a Senior Site Reliability Engineer (SRE) to join us at Thinkific. In this role, you’ll take ownership of scaling, securing, and optimizing the infrastructure that powers thousands of online course creators around the world. You will improve the performance, reliability, and security of our platform by partnering with cross-functional teams, driving SRE best practices, and ensuring operational excellence.

As a senior member of the team, you’ll mentor others, lead key reliability initiatives, contribute hands-on to infrastructure projects, and act as a domain expert in modern cloud-native practices, with a specific emphasis on Kubernetes, cloud infrastructure (AWS), observability, and service reliability. Your goal will be to help guide and execute on projects related to your technical domain.

Here’s how you’ll accomplish this:

- Own and improve technical domains across our infrastructure, ensuring high standards of system reliability, performance, scalability, and security.
- Design and implement scalable infrastructure using Kubernetes, AWS services (EKS, RDS, S3, IAM, ALB, etc.), and Infrastructure-as-Code tools like Terraform and Helm.
- Enhance and maintain deployment pipelines, enabling teams to release with speed, confidence, and security.
- Participate in and lead incident response efforts, ensuring blameless postmortems and continuous learning.
- Collaborate with development teams to define SLOs, SLIs, and error budgets, promoting reliability-focused design from the start.
- Automate operational tasks and improve developer experience with scripts and tools written in Ruby, Python, Node.js, or Bash.
- Maintain observability using tools such as Datadog, New Relic, Prometheus, Grafana, and Sentry, ensuring monitoring and alerting align with meaningful SLOs.
- Support and optimize distributed systems including relational and non-relational databases, message queues, and asynchronous architectures.
- Mentor and coach other engineers, helping raise the technical bar and foster a culture of collaboration and operational excellence.
- Participate in on-call rotation to help maintain a high level of service reliability.

The person we have in mind likely:

- Has 5+ years of software or infrastructure engineering experience, with 3+ years in Site Reliability or DevOps-focused roles.
- Has strong experience operating Kubernetes in production environments.
- Has proven AWS experience with infrastructure and services such as EKS, RDS, S3, IAM, and ALB.
- Is proficient with Infrastructure-as-Code (Terraform, Helm) and automation tools.
- Has strong scripting/coding ability in Ruby, Python, Node.js, or Bash.
- Has experience with monitoring and observability tools (New Relic, Datadog, Prometheus, Grafana, etc.).
- Has a solid understanding of networking, TLS, encryption protocols, and distributed systems.
- Has experience improving CI/CD pipelines and secure software supply chains.
- Is familiar with Cloudflare, CDN configuration, and load balancing strategies.
- Has strong problem-solving skills, an ownership mentality, and the ability to thrive in a fast-paced environment.
- Enjoys collaborating across teams and helping shape engineering roadmaps and architectural direction.
- Brings a strong ownership mentality, cares deeply about developer experience and operational excellence, and thrives in a fast-paced environment.

These things would also be nice, but we think you could learn them on the job:

- Experience working with Ruby on Rails and/or Node.js applications in production.
- Familiarity with Cloudflare, load balancing strategies, and CDN configuration.
- Experience improving CI/CD pipelines and secure software supply chains.
- CKA certification or equivalent Kubernetes expertise.

We’re committed to fair and transparent pay that reflects both where you are and where you can grow to. This role has a salary range of $111,100 – $138,900 – $166,700 in Canada, designed to capture the full journey from developing skills to excelling in the position. Most new hires start between the minimum and midpoint, which aligns with being fully capable in the role. Salaries above the midpoint are typically reserved for team members who have demonstrated strong, consistent performance, deep expertise, and a significant positive impact within the role. For high-demand or hard-to-fill positions like this one, we may hire above midpoint for candidates who bring exceptional experience, skills, or impact potential.

Diversity, Equity, Inclusion and Belonging & Accessibility

This is just our initial idea of who we’re looking for. At Thinkific, we know that people have unique career journeys. If your experience is close to what we’ve described but you feel that you might be missing a few of the requirements, please still apply. We believe in equal opportunity and are committed to diversity, equity, inclusion, and belonging across every facet of our business. We’re also committed to providing a comfortable and accessible interview experience for every candidate. If there are any accommodations our team can make throughout our hiring process (big or small), please let us know.

Seniority level: Mid-Senior level
Employment type: Full-time
Industries: E-Learning Providers

Senior Site Reliability Engineer

1 day ago

Canada DuckDuckGo Full time

Who We Are: Hi, we're DuckDuckGo, the online protection company and remote-first team of 300+ on a mission to raise the standard of trust online. Founded in 2008 and profitable since 2014, our annual revenue now exceeds $100 million USD. Millions use our...
Site Reliability Engineer

2 hours ago

Toronto, Ontario, Canada Scotiabank Global Site Full time

Requisition ID: 245210. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. The Team: Global Banking and Markets Engineering (GBME) is the fast-moving, award-winning technology engine that powers Scotiabank's Corporate, Investment Banking and Capital Markets businesses. The Role: GBME is searching for a Site...
Senior Site Reliability Engineer

2 weeks ago

Canada Sage Recruiting Inc. Full time

This range is provided by Sage Recruiting Inc. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Base pay range: CA$180,000.00/yr - CA$200,000.00/yr. Senior Site Reliability Engineer (Founding Role). Location: Canada. About the Role: This team is building a brand-new fintech platform from the ground up and is...
Senior Site Reliability Engineer

1 day ago

Canada TextNow Full time

This range is provided by TextNow. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Base pay range: CA$113,400.00/yr - CA$162,000.00/yr. We believe communication belongs to everyone. We exist to democratize phone service. TextNow is evolving the way the world connects and that's because we're made up of...
Site Reliability Engineer

2 weeks ago

Toronto, Ontario, Canada Scotiabank Global Site Full time

Requisition ID: 244026. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Overview: As a Site Reliability Engineer (SRE), you will join the Digital Engineering Operations team, responsible for ensuring the operations and reliability of Scotiabank digital applications. You will have the opportunity to drive...
Site Reliability Engineer

1 day ago

Toronto, Ontario, Canada Scotiabank Global Site Full time

Requisition ID: 244027. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Overview: As a Site Reliability Engineer (SRE), you will join the Digital Engineering Operations team, responsible for ensuring the operations and reliability of Scotiabank digital applications. You will have the opportunity to drive...
Senior Site Reliability Engineer

24 hours ago

BC, Canada Orion Innovation Full time

Overview: Senior Site Reliability Engineer (SRE) with Kubernetes and Rancher. Full-time role focused on building and maintaining highly resilient, secure systems, including in air-gapped environments. Responsibilities: System Architecture & Management: Design, architect, and maintain highly reliable, multi-tenant systems using Kubernetes and related tools...
Senior Site Reliability Engineer

2 hours ago

Canada Jobgether Full time

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in Canada. We are looking for an experienced Senior Site Reliability Engineer to help scale and secure a high-traffic, rapidly growing platform. In this role, you will be responsible for ensuring system reliability,...
Senior Site Reliability Engineer

1 day ago

Canada D-Wave Full time

D‑Wave (NYSE: QBTS) is a leader in the development and delivery of quantum computing systems, software, and services. We are the world’s first commercial supplier of quantum computers, and the only company building both annealing and gate‑model quantum computers. Our mission is...
Site Reliability Engineer

1 day ago

Canada Bitcomplete Full time

Join us as a Senior Site Reliability Engineer to help us run an industry-scale GPU cluster via Kubernetes. Together with senior members of our team, you will combine your strong understanding of system scaling and security practices with your cloud-native expertise to stand up and maintain Kubernetes clusters from scratch. Your role will also be pivotal in...


Senior Site Reliability Engineer