Site Reliability Engineer

4 weeks ago


Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Tyk Full time

About Tyk

The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across the retail, finance, telecoms, healthcare, or media industries (to name just a few). If you've banked online, used an app to check the news, or perhaps even driven a connected car, APIs, and by extension Tyk, made that possible.

Founded in 2015 with offices in London (UK), London (Ontario), Atlanta and Singapore, we have many thousands of users of our B2B platform across the globe. Brands using Tyk range from Lotte, Bell and T-Mobile to RBS, Capital One and Vinci. We have a varied user base hailing from every continent, even Antarctica.

Mission

Tyk is on a mission to connect every system in the world. We've started by building an API Management platform.

The role

We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature and always looking for ways to improve, as we will look to you for new ideas, solutions and metrics on how we can improve the platform. You will also be our first line of incident management for our clients and will help define our response going forward. This is a great opportunity to become an integral part of Tyk as we continue on our journey.

Responsibilities

  • Maintain the global Tyk Cloud within SLAs, and help define them.
  • Identify reliability issues and work with your squad to solve them.
  • Introduce new metrics and build relevant dashboards.
  • Participate in the on-call rotation.
  • Expand the multi-region and multi-cloud reach of the platform.
  • Document operational knowledge.
  • Conduct post-incident analysis.
  • Automate common tasks.
  • Shape and contribute to the continuous improvement agenda (user stories, estimation, communication, etc.).
  • Ensure reliability of the new global Tyk Cloud platform.
  • Automate operations and support.
  • Write and maintain documentation on SRE processes and policies.
  • Recommend and implement ways to drive operational efficiency and reduce cost without impacting service.
  • Assist in penetration testing for the Cloud by liaising with our provider, providing technical details, and setting up environments.
  • Manage incident response.

Experience

  • Strong collaboration skills.
  • Launching and operating production-scale Kubernetes clusters.
  • Designing and operating infrastructure on AWS and other cloud providers.
  • Operating MongoDB (or other document database) clusters.
  • Operating Redis (or other key-value storage) clusters.
  • Administering Linux servers.
  • Maintaining distributed software.
  • Operating Prometheus and Grafana.
  • Operating logging collection and analysis systems.
  • Participating in the on-call rotation (16:00 to 04:00 UTC).

Skills

  • Kubernetes & containers (advanced).
  • AWS/EKS (advanced).
  • Linux (advanced).
  • Terraform and IaC in general (proficient).
  • Helm (proficient).
  • Go and/or Python (familiar).
  • MongoDB (or similar).
  • Redis (or similar).
  • Monitoring: Prometheus, Grafana, Thanos (familiar).
  • Knowledge of networking concepts (subnets, routing, peering, load balancing, NAT, etc.).
  • Common networking protocols (DNS, TCP/IP, HTTP, TLS, UDP).
  • Proactive, energetic, innovative and change-oriented.

Nice to have

  • Experience with GCP or Azure.
  • Bare-metal infrastructure engineering.
  • API management experience.
  • Large-scale distributed storage management.
  • Familiarity with Rancher.
  • CKA/CKAD/CKS certification.
  • Creating and delivering production software in Go.

Benefits

  • Unlimited paid holiday for everyone.
  • Total flexibility in hours.
  • Employee share scheme.
  • Generous maternity and paternity leave.
  • Company retreats.

Values & Culture

  • It's okay to make mistakes: failures are learning opportunities.
  • Untested ideas are not stupid.
  • Trust starts with you, so make it count.
  • Assume best intent and support each other.
  • Always try to leave things better than you found them.

Equal Opportunity

We're an equal-opportunity employer and we are determined to ensure that no applicant or employee receives less favourable treatment on the grounds of gender, age, disability, religion, belief, sexual orientation, marital status, or race, or is disadvantaged by conditions or requirements which cannot be shown to be justifiable.

You can learn more about us at https://tyk.io.



  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Orion Innovation Full time

    Senior Site Reliability Engineer (SRE) with Kubernetes & Rancher. Location: Canada - Remote (Working EST hours). Job Type: Full-time. About the Role: Are you an exceptional Site Reliability Engineer with a passion for building and maintaining highly resilient and secure systems? We are seeking a Senior SRE to join our team and play a critical role in managing...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Surrey, Victoria, London, Halton Hills, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Tecsys Inc. Full time

    Having recognized the advantages of remote work, including its benefits for employee morale and productivity and the effect of reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Surrey, Victoria, London, Halton Hills, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Tecsys Inc. Full time

    Having recognized the advantages of remote work, including its benefits for employee morale and productivity and the effect of reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end....


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Orion Innovation Full time

    Job Description: Senior Site Reliability Engineer (SRE) with Kubernetes & Rancher. Location: Canada - Remote (Working EST hours). Job Type: Full-time. About the Role: Are you an exceptional Site Reliability Engineer with a passion for building and maintaining highly resilient and secure systems? We are seeking a Senior SRE to join our team and play a critical...


  • Ottawa, Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Canonical Full time

    Overview: Site Reliability Engineer role at Canonical. Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our...

  • Site Reliability

    1 week ago


    Winnipeg, Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Halifax, Saskatoon, Burnaby, Hamilton, Surrey, Victoria, London, Halton Hills, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Canonical Full time

    Site Reliability / Gitops Engineer role at Canonical. Canonical is a leading provider of open source software and operating systems to the global enterprise and...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Targeted Talent Full time

    Overview: We are looking for an experienced Senior Site Reliability Engineer for our client. This is a permanent position that is remote to start, with later relocation to Calgary or Winnipeg. Our client is a global enterprise company with a product that you've likely used. Experience with coding/software development, along with Site Reliability will be the key...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Orion Innovation Full time

    We are seeking a highly specialized and experienced Senior Site Reliability Engineer (SRE) to drive the reliability, performance, and automation of our core platform. This role requires an exceptional blend of deep programming expertise in both Ruby and Go, coupled with hands‑on mastery of Linux systems, advanced networking concepts (specifically IPSec),...


  • Edmonton, Toronto, Montreal, Calgary, Vancouver, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Surrey, Victoria, London, Halton Hills, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Canonical Full time

    Overview: Senior Site Reliability Engineer role at Canonical. Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. We operate...


  • Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada TekRek Full time

    This range is provided by TekRek. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Base pay range: CA$90.00/hr - CA$120.00/hr. Senior Site Reliability Engineer (Distributed Systems, Kubernetes, AWS/GCP). The Company: TekRek has partnered with a fast-scaling AI infrastructure company building one of the...