Site Reliability Engineer

3 weeks ago


Canada Orion Innovation Full time

Senior Site Reliability Engineer (SRE) with Kubernetes & Rancher

Location: Canada - Remote (Working EST hours)
Job Type: Full-time

About the Role

Are you an exceptional Site Reliability Engineer with a passion for building and maintaining highly resilient and secure systems? We are seeking a Senior SRE to join our team and play a critical role in managing mission‑critical infrastructure. In this role, you will leverage your expertise in Kubernetes and Rancher to ensure the reliability, performance, and security of our systems, particularly within secure, zero‑connectivity environments. This is a unique opportunity for a seasoned SRE who thrives on tackling complex challenges and wants to make a significant impact on system resilience and security.

What You’ll Do

• Design, architect, and maintain highly reliable, multi‑tenant systems using advanced tools like RKE2 and Kubernetes.
• Serve as the expert on key components such as Ingress, Kong, Artifactory, and Sonar.
• Implement and manage advanced observability solutions with Prometheus, Grafana, Splunk, and Elastic, ensuring deep visibility into system health and performance, even in air‑gapped settings.
• Guarantee that all deployments meet stringent compliance standards and are consistently optimized for maximum performance and security.
• Proactively perform regular code quality analysis and security assessments using Sonar, identifying and mitigating potential vulnerabilities before they become a problem.
• Partner with our Lead and specialized teams to quickly resolve incidents and drive continuous improvement of system resilience and recovery procedures.
• Create and maintain meticulous documentation for all system configurations, runbooks, and disaster recovery plans, which is a critical function for managing systems in a classified environment.

Required Skills and Qualifications

• 8+ years of experience as a Site Reliability Engineer.
• Technical expertise with RKE2, Kubernetes, Ingress, Kong, Artifactory, Prometheus, Grafana, Splunk, Elastic, and Sonar.
• Strong background and demonstrated experience in Site Reliability Engineering and in implementing comprehensive system observability strategies.
• Proven experience working in air‑gapped or zero‑connectivity environments, with a deep understanding of the unique challenges and best practices for protecting classified data.
• Exceptional ability to troubleshoot and optimize complex, multi‑tenant infrastructures under pressure.

Preferred Qualifications

• Relevant SRE or DevOps certifications (e.g., CKAD, CKA) that validate your expertise.
• Prior experience in government or defense‑related SRE roles.
• Experience with Rancher and its ecosystem.

Seniority level: Mid‑Senior level
Employment type: Full‑time
Job function: Engineering and Information Technology
Industries: IT Services and IT Consulting



  • Canada Thinkific Full time

    Are you an experienced Site Reliability Engineer looking for a new challenge? We’re looking for a Senior Site Reliability Engineer to join us at Thinkific. We’re looking for a Senior Site Reliability Engineer...


  • Canada : Ontario : Toronto Scotiabank Global Site Full time $120,000 - $180,000 per year

    Requisition ID: 239640. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. The Role: As a member of the Systems Reliability Engineering team, the System Reliability Engineer will collaborate closely with Engineering and development teams, peers, and business partners to continuously improve the stability,...


  • Canada Icon Full time

    Helping SaaS companies scale Engineering teams. Director, Site Reliability Engineering We are seeking an accomplished Director of Site Reliability Engineering (SRE) to lead the reliability, scalability, and performance initiatives across multiple enterprise technology domains, including AML, Risk, Finance, Corporate Treasury, and Human Resources systems....


  • Canada Orion Innovation Full time

    Job Description: Senior Site Reliability Engineer (SRE) with Kubernetes & Rancher Location: Canada - Remote (Working EST hours) Job Type: Full-time About the Role Are you an exceptional Site Reliability Engineer with a passion for building and maintaining highly resilient and secure systems? We are seeking a Senior SRE to join our team and play a critical...


  • Canada Akamai Technologies Full time

    Senior Site Reliability Engineer Join Akamai Technologies as we build a reliable, secure, and scalable Internet. We are looking for a Senior Site Reliability Engineer to help us solve complex performance and reliability challenges. Job Description Are you passionate about cutting‑edge technology and ready to tackle some of the Internet’s most difficult...


  • Canada Targeted Talent Full time

    Overview We are looking for an experienced Senior Site Reliability Engineer for our client. This is a permanent position that is remote to start, with later relocation to Calgary or Winnipeg. Our client is a global enterprise company with a product that you've likely used. Experience with coding/software development, along with Site Reliability will be the...


  • Canada DuckDuckGo Full time

    6 days ago. Who We Are: Hi, we're DuckDuckGo, the online protection company and remote-first team of 300+ on a mission to raise the standard of trust online. Founded in 2008 and profitable since 2014, our annual revenue now exceeds $100 million USD. Millions use our...


  • Canada TextNow Full time

    This range is provided by TextNow. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Base pay range: CA$113,400.00/yr - CA$162,000.00/yr. We believe communication belongs to everyone. We exist to democratize phone service. TextNow is evolving the way the world connects and that's because we're made up of...


  • Canada Telna Full time

    Site Reliability Engineer – Security Engineer


  • Canada Orion Innovation Full time

    We are seeking a highly specialized and experienced Senior Site Reliability Engineer (SRE) to drive the reliability, performance, and automation of our core platform. This role requires an exceptional blend of deep programming expertise in both Ruby and Go, coupled with hands‑on mastery of Linux systems, advanced networking concepts (specifically IPSec),...