Senior Site Reliability

13 hours ago


Montreal, Canada Canonical Full time

Senior Site Reliability / GitOps Engineer

Join Canonical, a leading provider of open-source software and operating systems, as a Senior Site Reliability / GitOps Engineer. In this role you will drive operations automation and infrastructure as code (IaC) across Canonical's private and public clouds, using open-source tools such as CI/CD pipelines, Prometheus, Grafana and Elasticsearch.

Job Summary

The IS team supports and maintains all of Canonical's IT production services, managing infrastructure used by over 60 million Ubuntu users. You will act as an embedded tech lead, collaborating with the IS architect and other teams to design and deliver reusable services and products at scale.

Responsibilities

  • Develop automation and GitOps solutions, acting as tech lead within your team.
  • Collaborate with the IS architect to align solutions with the overall architecture vision.
  • Design and architect services that can be offered as products across the organization.
  • Drive IaC practice, increasing automation and improving IaC processes.
  • Automate software operations across private and public clouds, addressing distributed system complexities.
  • Maintain operational responsibility for Canonical's core services, networks and infrastructure.
  • Develop skills in troubleshooting, capacity planning and performance investigation; set up and maintain observability tools (Prometheus, Grafana, Elasticsearch).
  • Assist and collaborate with globally distributed engineering, operations and support peers.
  • Dedicate uninterrupted development time to larger automation projects.
  • Share experience and best practices through design sessions, mentorship and collaborative work.
  • Take final responsibility for time-critical escalations.

Qualifications

  • Modern view on hosting architecture driven by IaC across private and public clouds.
  • Product mindset focused on developing products rather than one-off solutions.
  • Python software development experience with large projects.
  • Experience with Kubernetes or other container orchestration systems.
  • Proven experience managing and deploying cloud infrastructure with code.
  • Practical knowledge of Linux networking, routing and firewalls.
  • Affinity with Linux storage (Ceph, databases, etc.).
  • Hands-on experience administering enterprise Linux servers.
  • Extensive knowledge of cloud computing concepts and technologies.
  • Bachelor's degree or higher, preferably in computer science or a related engineering field.
  • Effective communication skills in English.
  • Motivation and ability to troubleshoot from kernel to web and to ask for help when appropriate.
  • Willingness to learn quickly and adapt to fast-changing environments.
  • Enjoyment working within distributed teams.
  • Passion for open source, especially Ubuntu or Debian.

Benefits

  • Distributed work environment with twice-yearly in-person sprints.
  • Personal learning and development budget of USD 2,000 per year.
  • Annual compensation review and performance-driven bonus or commission.
  • Recognition rewards and annual holiday leave.
  • Maternity and paternity leave.
  • Team member assistance program and wellness platform.
  • Opportunities to travel to new locations and meet colleagues.
  • Priority Pass and travel upgrades for long-haul company events.

About Canonical

Canonical is a pioneering tech firm at the forefront of the global shift to open source. As the publisher of Ubuntu, one of the most important open-source projects, Canonical supports AI, IoT and cloud initiatives worldwide. We recruit globally, maintain a high standard of excellence, and many colleagues work from home. Canonical fosters an inclusive workplace free from discrimination. Diversity of experience, perspectives and background creates a better work environment and better products.



  • Montreal, Canada ApTask Full time

    Looking for an intermediate candidate with 2 to 5 years' experience. The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive the reliability engineering, operations and customer support services for the client's ServiceNow SaaS implementation. Reporting to a Site Reliability...


  • Montreal, Canada Open Systems Technologies Full time

    Site Reliability Engineer (SRE), ServiceNow, Application Infrastructure. Location: Montreal – Hybrid – 3 days/week. The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive reliability engineering, operations and customer support services for the client's ServiceNow SaaS implementation. Reporting to a Site...


  • Montreal West, Canada Orion Innovation Full time

    Posted 1 week ago. Orion Innovation is a premier, award-winning, global business and technology services firm. Orion delivers game-changing business transformation and product development rooted in digital strategy, experience design, and engineering, with a unique combination of agility, scale, and maturity. We work with a wide...


  • Montreal, Canada Devopshunt Full time

    We are seeking a Senior Site Reliability Engineer (SRE) to join our team and play a key role in ensuring the reliability, scalability, and security of our hybrid AWS infrastructure. Reporting to the Digital Infrastructure Team Lead, you will collaborate with cross-functional teams to design and optimize cloud platforms, streamline developer workflows, and...


  • Montreal, Canada Canonical Full time

    Senior Site Reliability Engineer. Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our...


  • Montreal, Canada Intelcom Express Inc. Full time

    Senior Site Reliability Engineer (SRE). Locations: Canada, Quebec, Montreal. Time type: Full time. Posted on: Today. Job requisition ID: JR109652. Ride the next mile with us! Responsibilities: Incident Management: Detect and respond to issues, ensuring rapid recovery to...

