Site Reliability Engineer
2 weeks ago
Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus, and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in an increasingly interconnected world. Our solutions are designed to meet current digital demands and anticipate future challenges.
At Arista, we value the diversity of thought and perspectives that each employee brings. We believe fostering an inclusive environment, where individuals from various backgrounds and experiences feel welcome, is essential for driving creativity and innovation.
Our commitment to excellence has earned us prestigious awards, such as Best Engineering Team and Best Company for Diversity, Compensation, and Work-Life Balance. We pride ourselves on our track record of success and strive to uphold the highest standards of quality and performance.
Job Description
Who You’ll Work With
SREs at Arista combine strong software and systems engineering skills with a passion for operating production systems at scale. As an SRE, you’ll be part of the team responsible for our global service fleet.
What You’ll Do
As an SRE, you’ll be responsible for our global CloudVision service fleet, including:
- Building the CI/CD lifecycle for services, from inception and design to deployment and scaling
- Improving operational processes through automation
- Identifying key service indicators for capacity planning
- Owning disaster recovery and management
- Designing infrastructure and cloud-based application security
- Leading incident response and blameless postmortems
- Participating actively in our distributed on-call team
CloudVision is an enterprise network management and streaming telemetry SaaS, deployed on Kubernetes across regions using Spinnaker for CI/CD. Our tech stack includes GKE, HBase/Hadoop, ElasticSearch, ClickHouse, Kafka, and TensorFlow, with monitoring built on Prometheus, Grafana, Loki, and other OSS tools.
Qualifications
- BS/MS degree in Computer Science or relevant experience
- 5+ years of software engineering experience
- Experience with deploying distributed database systems or scale-out SaaS applications
Compensation
The salary range for this role is $95,000 to $145,000, with pay based on location, skills, experience, and qualifications. Additional benefits include bonuses, equity, and comprehensive health plans. Details will be shared during the hiring process.
-
Site Reliability Engineer
1 day ago
Vancouver, Canada BNB Chain Full time LayerZero The Future is Omnichain. Founded in 2021, LayerZero’s vision is to create a community of cross-chain developers, building dApps that are no longer constrained by individual blockchain capabilities. With LayerZero's simple, generic messaging protocol, builders will develop cross-chain dApps designed to unify the power of individual blockchains. We...
-
Site Reliability Engineer
2 weeks ago
Vancouver, Canada Apple Inc. Full time Vancouver, British Columbia, Canada Software and Services The Apple Service Engineering - SRE team is looking for Site Reliability Engineers with experience in developing processes, tools, and automation for managing distributed systems in production environments. Our SRE team combines software and systems engineering and system administration practices to...
-
Site Reliability Engineer
1 day ago
Vancouver, Canada Apple Inc. Full time Vancouver, British Columbia, Canada Software and Services The Apple Service Engineering - SRE team is looking for Site Reliability Engineers with experience in developing processes, tools, and automation for managing distributed systems in production environments. Our SRE team combines software and systems engineering and system administration practices to...
-
Site Reliability Engineer
3 weeks ago
Vancouver, Canada Vancity Full time Site Reliability Engineer Join Vancity, a member‑owned credit union committed to inclusion and social justice. Our cloud‑first strategy spans digital banking, core banking, data platforms, and member‑facing services. We pride ourselves on being the largest private‑sector Living Wage Employer in Canada and a Top Employer nationwide. Your Role in...
-
Site Reliability Engineer
6 days ago
Vancouver, Canada Vancity Full time Site Reliability Engineer Join Vancity, a member‑owned credit union committed to inclusion and social justice. Our cloud‑first strategy spans digital banking, core banking, data platforms, and member‑facing services. We pride ourselves on being the largest private‑sector Living Wage Employer in Canada and a Top Employer nationwide. Your Role in...
-
Senior Site Reliability Engineer
6 days ago
Vancouver, Canada Regie Full time We’re seeking a senior Site Reliability Engineer/DevOps who is passionate about building the best infrastructure and maintaining the health of the systems. Design and maintain scalable, secure, and reliable infrastructure to support Regie.ai's SaaS platform and AI/data workloads. Architect a unified monitoring and alerting system for engineering teams to...
-
Senior Site Reliability Engineer
5 days ago
Vancouver, Canada Cerebras Full time Responsibilities We’re seeking a senior Site Reliability Engineer/DevOps who is passionate about building the best infrastructure and maintaining the health of the systems. Design and maintain scalable, secure, and reliable infrastructure to support Regie.ai's SaaS platform and AI/data workloads. Architect a unified monitoring and alerting system for...
-
Site Reliability Engineer
3 days ago
Vancouver, Canada Vancity Full time Our Story & Purpose: We’re Vancity, a member-owned credit union built on the principles of inclusion and social justice. Since 1946, our relentless commitment to these values has helped us challenge the status quo and break down barriers. We’ve made bold commitments to become net-zero by 2040 across all mortgages and loans, and we’re actively pursuing...
-
Senior Site Reliability Engineer
2 weeks ago
Vancouver, Canada Cerebras Full time Responsibilities We’re seeking a senior Site Reliability Engineer/DevOps who is passionate about building the best infrastructure and maintaining the health of the systems. Design and maintain scalable, secure, and reliable infrastructure to support Regie.ai's SaaS platform and AI/data workloads. Architect a unified monitoring and alerting system for...