Site Reliability Engineer / On-Prem

2 days ago


Toronto, Canada Motion Recruitment Full time

Manager, Talent Services at Motion Recruitment

Join a deeply technical platform engineering team focused on building scalable, high‑availability infrastructure in a hybrid environment. This role offers meaningful ownership across core components of the stack, from bare‑metal Kubernetes clusters and custom container runtimes to distributed storage, CI/CD automation, and observability systems. You’ll lead operational efforts that directly impact platform uptime and performance, while building and refining tooling used by multiple engineering teams across the company.

The environment is hands‑on and high‑trust, with the expectation that you’ll bring strong systems intuition, operational depth, and a collaborative mindset. You’ll work with ArgoCD, Argo Workflows, GH Actions, and Atlantis in production, while driving improvements in automation, incident response, and developer experience.

Required Skills & Experience

  • 4+ years of experience in operations or SRE roles, including production support and platform ownership
  • Strong hands‑on experience operating Kubernetes in bare‑metal or cloud‑native environments
  • Proven ability to design and implement monitoring, disaster recovery, and observability systems (e.g., Prometheus, OpenTelemetry, LGTM stack)

Desired Skills & Experience

  • Familiarity with distributed file systems such as Ceph, and on‑prem hardware management
  • Experience with IaC tools like Terraform and Puppet for provisioning and system state management
  • Exposure to physical or virtual network infrastructure (e.g., Juniper) and hybrid cloud topologies

Daily Responsibilities

  • Hands‑On Engineering: 75%
  • Team Collaboration & Cross‑Functional Work: 25%

Applicants must be currently authorized to work in Canada on a full‑time basis now and in the future.

Accommodation will be provided in all parts of the hiring process as required under Motion Recruitment’s Employment Accommodation policy. Applicants need to make their needs known in advance.
Posted By: Adrian Cronk



  • Canada : Ontario : Toronto Scotiabank Global Site Full time US$80,000 - US$140,000 per year

    Requisition ID: 244027. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Overview: As a Site Reliability Engineer (SRE), you will join the Digital Engineering Operations team, responsible for ensuring the operations and reliability of Scotiabank digital applications. You will have the opportunity to drive...


  • Canada : Ontario : Toronto Scotiabank Global Site Full time $105,000 - $170,000 per year

    Requisition ID: 244026. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Overview: As a Site Reliability Engineer (SRE), you will join the Digital Engineering Operations team, responsible for ensuring the operations and reliability of Scotiabank digital applications. You will have the opportunity to drive...


  • Toronto, Canada Resonaite Full time

    Site Reliability Engineer (DevOps/Release) Our client in the professional services sector is seeking an SRE to enhance service resiliency, automation, and operational excellence across critical cloud and on‑prem workloads. This role focuses on stability engineering, cloud migration readiness, observability, continuous improvement, and L2 production...


  • Toronto, Ontario, Canada Procom Full time $80,000 - $120,000 per year

    Site Reliability Engineer (SRE) / Ingénieur Fiabilité des Sites. On behalf of our banking client, Procom is seeking a Site Reliability Engineer (SRE) for a 12-month contract role. This position is a hybrid role, 3 days a week onsite at our client's Montréal, Quebec office. Site Reliability Engineer - Job Description: The Site Reliability Engineer is...


  • Toronto, Canada Maneva Full time

    About Maneva: Maneva builds and deploys edge AI solutions powering real-time intelligence for industrial environments. Our systems run on distributed edge compute devices (NVIDIA Jetson platforms), integrate with local network cameras, PLCs, sensors, and other on-premise equipment, and securely communicate with cloud services via client- or site-based VPNs....

