Senior Site Reliability Engineer

2 weeks ago


Vancouver, British Columbia, Canada Red Hat Canada Limited (f.k.a Cygnus Solutions Canada Limited) Full time

About the Job

Red Hat is seeking a Senior Site Reliability Engineer (SRE) to develop, scale, and operate our OpenShift managed cloud services. OpenShift is Red Hat's enterprise Kubernetes distribution. As an SRE you will contribute to running OpenShift at scale by enabling customer self-service, making our monitoring system more sustainable, and eliminating work through automation.

On the SRE team, you will have the opportunity to influence the complex challenges of scale which are unique to Red Hat managed cloud services, while using your skills in coding, operations, and large-scale distributed system design.

Red Hat relies on teamwork and openness for its success. We are a global team and strive to cultivate a transparent environment that makes room for different voices. We learn from our failures in a blameless environment to support the continuous improvement of the team. At Red Hat, your individual contributions have more visibility than at most large companies, and visibility means career opportunities and growth.

What You'll Do

The day-to-day responsibilities of an SRE involve working with live systems and coding automation. As an SRE you will be expected to:

  • Contribute code to increase the scalability and reliability of the service
  • Contribute software tests and participate in peer review to increase the quality of our codebase
  • Help and develop peers' capabilities through knowledge sharing, mentoring, and collaboration
  • Participate in a regular on-call schedule, including occasional paid weekends and holidays
  • Practice sustainable incident response and blameless postmortems
  • Resolve customer issues escalated from the Red Hat Global Support team
  • Work within a small agile team to develop and improve SRE software, support your peers, plan and self-improve

What You'll Bring

  • Bachelor's degree in Computer Science or related technical field; or equivalent experience
  • Programming experience in at least one of the following languages: Python, Golang, Java, C, C++ or another object-oriented language
  • Experience working with public clouds such as AWS, GCP, or Azure
  • Ability to collaboratively troubleshoot and solve problems in a team setting
  • Experience troubleshooting an as-a-service offering (SaaS, PaaS, etc.)
  • Experience working with complex distributed systems. Direct experience with Kubernetes or OpenShift is a plus.
  • We like to see a demonstrated ability to debug, optimize code and automate routine tasks.
  • We are Red Hat, so you need a basic understanding of Unix/Linux operating systems.

Desired skills

  • Demonstrated ability to debug, optimize code and automate routine tasks
  • 2+ years of experience programming with at least one object-oriented language; Golang, Java, or Python are preferred
  • 2+ years of experience delivering a hosted service
  • 2+ years of experience managing Linux servers running Red Hat Enterprise Linux (RHEL), CentOS, or Fedora hosted at a cloud provider such as Amazon Web Services (AWS), Google Compute Engine (GCE), or Microsoft Azure
  • 3+ years of experience with enterprise systems monitoring; knowledge of Prometheus is a plus
  • 3+ years of experience with enterprise configuration management software like Ansible by Red Hat, Puppet, or Chef
  • Demonstrated ability to quickly and accurately troubleshoot system issues
  • Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP
  • Solid communications skills and experience working directly with and presenting to customers
  • 1+ year(s) of experience with Kubernetes is a plus
  • 1+ year(s) of experience with docker-based containers is a plus

About Red Hat
Red Hat is the world's leading provider of enterprise software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates have the flexibility to choose the work environment that suits their needs from in-office to fully remote to office-flex. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact. Opportunities are open. Join us.

Diversity, Equity & Inclusion at Red Hat
Red Hat's culture is built on the open source principles of transparency, collaboration, and inclusion, where the best ideas can come from anywhere and anyone. When this is realized, it empowers people from diverse backgrounds, perspectives, and experiences to come together to share ideas, challenge the status quo, and drive innovation. Our aspiration is that everyone experiences this culture with equal opportunity and access, and that all voices are not only heard but also celebrated. We hope you will join our celebration, and we welcome and encourage applicants from all the beautiful dimensions of diversity that compose our global village.

Equal Opportunity Policy (EEO)
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.


Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.


Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email . General inquiries, such as those regarding the status of a job application, will not receive a reply.


