Site Reliability Engineer

3 weeks ago


Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Orion Innovation Full time

Senior Site Reliability Engineer (SRE) with Kubernetes & Rancher

Location: Canada - Remote (Working EST hours)
Job Type: Full-time

About the Role

Are you an exceptional Site Reliability Engineer with a passion for building and maintaining highly resilient and secure systems? We are seeking a Senior SRE to join our team and play a critical role in managing mission‑critical infrastructure. In this role, you will leverage your expertise in Kubernetes and Rancher to ensure the reliability, performance, and security of our systems, particularly within secure, zero‑connectivity environments. This is a unique opportunity for a seasoned SRE who thrives on tackling complex challenges and wants to make a significant impact on system resilience and security.

What You’ll Do

  • Design, architect, and maintain highly reliable, multi‑tenant systems using advanced tools like RKE2 and Kubernetes.
  • Serve as the expert on key components such as Ingress, Kong, Artifactory, and Sonar.
  • Implement and manage advanced observability solutions with Prometheus, Grafana, Splunk, and Elastic, ensuring deep visibility into system health and performance, even in air‑gapped settings.
  • Guarantee that all deployments meet stringent compliance standards and are consistently optimized for maximum performance and security.
  • Proactively perform regular code quality analysis and security assessments using Sonar, identifying and mitigating potential vulnerabilities before they become a problem.
  • Partner with our Lead and specialized teams to quickly resolve incidents and drive continuous improvement of system resilience and recovery procedures.
  • Create and maintain meticulous documentation for all system configurations, runbooks, and disaster recovery plans, which is a critical function for managing systems in a classified environment.

Required Skills and Qualifications

  • 8+ years of experience as a Site Reliability Engineer.
  • Technical expertise with RKE2, Kubernetes, Ingress, Kong, Artifactory, Prometheus, Grafana, Splunk, Elastic, and Sonar.
  • Strong background and demonstrated experience in Site Reliability Engineering and in implementing comprehensive system observability strategies.
  • Proven experience working in air‑gapped or zero‑connectivity environments, with a deep understanding of the unique challenges and best practices for protecting classified data.
  • Exceptional ability to troubleshoot and optimize complex, multi‑tenant infrastructures under pressure.

Preferred Qualifications

  • Relevant SRE or DevOps certifications (e.g., CKAD, CKA) that validate your expertise.
  • Prior experience in government or defense‑related SRE roles.
  • Experience with Rancher and its ecosystem.

Seniority level: Mid‑Senior level
Employment type: Full‑time
Job function: Engineering and Information Technology
Industries: IT Services and IT Consulting



  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Surrey, Victoria, London, Halton Hills, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Tecsys Inc. Full time

    Having recognized the advantages of remote work, including improved employee morale and productivity and the benefits of reduced commuting for employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Surrey, Victoria, London, Halton Hills, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Tecsys Inc. Full time

    Having recognized the advantages of remote work, including improved employee morale and productivity and the benefits of reduced commuting for employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end....


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Orion Innovation Full time

    Job Description: Senior Site Reliability Engineer (SRE) with Kubernetes & Rancher. Location: Canada - Remote [Working EST hours]. Job Type: Full-time. About the Role: Are you an exceptional Site Reliability Engineer with a passion for building and maintaining highly resilient and secure systems? We are seeking a Senior SRE to join our team and play a critical...


  • Ottawa, Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Canonical Full time

    Overview: Site Reliability Engineer role at Canonical. Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our...

  • Site Reliability

    5 days ago


    Winnipeg, Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Halifax, Saskatoon, Burnaby, Hamilton, Surrey, Victoria, London, Halton Hills, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Canonical Full time

    Site Reliability / Gitops Engineer role at Canonical, 1 day ago. Canonical is a leading provider of open source software and operating systems to the global enterprise and...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Targeted Talent Full time

    Overview: We are looking for an experienced Senior Site Reliability Engineer for our client. This is a permanent position that is remote to start, with later relocation to Calgary or Winnipeg. Our client is a global enterprise company with a product that you've likely used. Experience with coding/software development, along with Site Reliability will be the key...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Orion Innovation Full time

    We are seeking a highly specialized and experienced Senior Site Reliability Engineer (SRE) to drive the reliability, performance, and automation of our core platform. This role requires an exceptional blend of deep programming expertise in both Ruby and Go, coupled with hands‑on mastery of Linux systems, advanced networking concepts (specifically IPSec),...


  • Edmonton, Toronto, Montreal, Calgary, Vancouver, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Surrey, Victoria, London, Halton Hills, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Canonical Full time

    Overview: Senior Site Reliability Engineer role at Canonical. Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. We operate...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada TekRek Full time

    Base pay range: CA$90.00/hr - CA$120.00/hr. This range is provided by TekRek. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Senior Site Reliability Engineer – Distributed Systems, Kubernetes, AWS/GCP. The Company: TekRek has partnered with a fast‑scaling AI infrastructure company building one of the...


  • Vancouver, Toronto, Montreal, Calgary, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Chainlink Labs Full time

    Senior Site Reliability Engineer role at Chainlink Labs. About Us: Chainlink Labs is the primary contributing developer of Chainlink, the decentralized computing platform powering the verifiable web....