Site Reliability Engineer

2 weeks ago


Vancouver, Canada Arista Networks Full time

Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus, and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in an increasingly interconnected world. Our solutions are designed to meet current digital demands and anticipate future challenges.

At Arista, we value the diversity of thought and perspectives that each employee brings. We believe fostering an inclusive environment, where individuals from various backgrounds and experiences feel welcome, is essential for driving creativity and innovation.

Our commitment to excellence has earned us prestigious awards, such as Best Engineering Team and Best Company for Diversity, Compensation, and Work-Life Balance. We pride ourselves on our track record of success and strive to uphold the highest standards of quality and performance.

Job Description

Who You’ll Work With

SREs at Arista combine strong software and systems engineering skills with a passion for operating production systems at scale. As an SRE, you’ll be part of the team responsible for our global service fleet.

What You’ll Do

As an SRE, you’ll be responsible for our global CloudVision service fleet, including:

- Building the CI/CD lifecycle for services, from inception and design to deployment and scaling
- Improving operational processes through automation
- Identifying key service indicators for capacity planning
- Owning disaster recovery and management
- Designing infrastructure and cloud-based application security
- Leading incident response and blameless postmortems
- Participating actively in our distributed on-call team

CloudVision is an enterprise network management and streaming telemetry SaaS, deployed on Kubernetes across regions using Spinnaker for CI/CD. Our tech stack includes GKE, HBase/Hadoop, ElasticSearch, ClickHouse, Kafka, and TensorFlow, with monitoring built on Prometheus, Grafana, Loki, and other OSS tools.

Qualifications

- BS/MS degree in Computer Science or relevant experience
- 5+ years of software engineering experience
- Experience deploying distributed database systems or scale-out SaaS applications

Compensation

The salary range for this role is $95,000 to $145,000, with pay based on location, skills, experience, and qualifications. Additional benefits include bonuses, equity, and comprehensive health plans. Details will be shared during the hiring process.



