Site Reliability Engineer
2 weeks ago
Vancouver, British Columbia, Canada Software and Services
The Apple Service Engineering - SRE team is looking for Site Reliability Engineers with experience in developing processes, tools, and automation for managing distributed systems in production environments. Our SRE team combines software and systems engineering and system administration practices to build and run large-scale, massively distributed, fault-tolerant systems. Our software ensures that Apple’s services are reliable, scalable and secure, and we leverage both open source and home‑grown technologies to provide managed data infrastructure services. You will help build next‑generation search infrastructure and platform services, collaborating cross‑functionally with various ASE teams, from store and commerce to search and recommendations. You’ll create platforms that can rapidly scale to serve personalized and non‑personalized data with very low latencies. You should be someone who is not afraid to question assumptions, is a standout colleague under tight deadlines, and can take on problems with elegant technical solutions.
Description
The ASE SRE team develops applications and tooling that are safe, reliable, scalable, and fast. Our Data Reliability Engineering team is responsible for all aspects of managing Voldemort key‑value distributed database infrastructure deployment on on‑premise bare metal and public cloud platforms, including maintenance, deployment automation, backup, observability and telemetry, with a focus on reliability, performance, and scaling to deliver continuous data store availability to ASE Media Applications.
Success in this role requires expertise in several of the following:
- Understanding of core SRE concepts: monitoring, alerting, and incident management
- Performance engineering (design concepts, profile‑guided optimization)
- Service management across bare metal and virtualized (EC2) platforms
- Preparing alert handling procedures and run‑books, and collaborating with other SRE team members
- Excellent communication and a high degree of customer focus when engaging with internal platform customers
- As part of a distributed team, the ability to work well with colleagues based in other locations is essential; experience in this area is a plus
- Prior experience with development or maintenance of distributed databases and operating systems is recommended
Come join us at Apple Services Engineering and help us deliver services and applications that are fluid and responsive. You will collaborate with engineers from across Apple to define the metrics, set targets, uncover optimization opportunities, and ship a service that will delight our customers. This role is for engineers who enjoy deep technical engineering that spans large cross‑organizational projects. Your openness to learning and implementing new technologies will contribute to the continuous evolution of our organization. Good ideas are valued and rewarded.
Minimum Qualifications
Success in this role requires expertise in several of the following:
- BS/MS in Computer Science or equivalent
- At least 2-5 years in a reliability engineering, DevOps, or infrastructure-focused role
- Support of internet-facing production services and distributed systems via deployments, on-call, and incident management.
- Understanding of distributed database concepts (consistency models, isolation levels, crash and recovery semantics).
- Performance engineering (design concepts, profile‑guided optimization).
- Datacenter architecture (networking topologies, host placement strategies, and failure modes); design of multi-datacenter systems; failure domains; and wide‑area networking.
- Automation advocate - prior history of removing operational toil via software.
- Self-motivated, inquisitive, and always looking to learn more.
Preferred Qualifications
- Demonstrated expertise in developing distributed systems or storage engines, or in performance engineering.
- Experience developing critical internet services and/or platform infrastructure.
- Proficient in one or more of the following programming languages: Java, Go (golang), Python
- Optional experience with EC2, EBS, and Terraform
At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $116,800 and $226,000, and your base pay will depend on your skills, qualifications, experience, and location.
Apple employees also have the opportunity to become an Apple shareholder through participation in Apple Inc.’s discretionary employee stock programs. Employees are eligible for discretionary restricted stock unit award recommendations, and can purchase Apple Inc. stock at a discount if voluntarily participating in Apple Inc.’s Employee Stock Purchase Plan. Participation in Apple Inc.’s discretionary stock programs is governed by Apple Inc.’s stock plans and agreements and is not part of local employment contracts or compensation.
You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and reimbursement for certain educational expenses, including tuition, for formal education related to advancing your career at Apple. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.
Note: Apple benefit and compensation programs are subject to eligibility requirements and other terms of the applicable plan or program.
Apple is an equal opportunity employer that is committed to inclusion and diversity. Apple provides reasonable accommodations to applicants with disabilities. Apple is a drug‑free workplace.
-
Site Reliability Engineer
4 weeks ago
Vancouver, Canada LayerZero Labs Full time
LayerZero The Future is Omnichain. Founded in 2021, LayerZero’s vision is to create a community of cross-chain developers,...
-
Site Reliability Engineer
4 weeks ago
Vancouver, Canada Arbitrum Full time
LayerZero The Future is Omnichain. Founded in 2021, LayerZero’s vision is to create a community of cross-chain developers, building dApps that are no longer constrained by individual blockchain capabilities. With LayerZero's simple, generic messaging protocol, builders will develop cross-chain dApps designed to unify the power of individual blockchains. We...
-
Site Reliability Engineer
2 weeks ago
Vancouver, British Columbia, Canada LayerZero Labs Full time
LayerZero The Future is Omnichain. Founded in 2021, LayerZero's vision is to create a community of cross-chain developers, building dApps that are no longer constrained by individual blockchain capabilities. With LayerZero's simple, generic messaging protocol, builders will develop cross-chain dApps designed to unify the power of individual blockchains. We are...
-
Site Reliability Engineer
4 weeks ago
Vancouver, Canada LayerZero Labs Full time
Founded in 2021, LayerZero’s vision is to create a community of cross-chain developers, building dApps that are no longer constrained by individual blockchain capabilities. With LayerZero's simple, generic messaging protocol, builders will develop cross-chain dApps designed to unify the power of individual blockchains. We are funded by the best investors...
-
Site Reliability Engineer
2 weeks ago
Vancouver, British Columbia, Canada Apple Full time
The Apple Service Engineering - SRE team is looking for Site Reliability Engineers with experience in developing processes, tools, and automation for managing distributed systems in production environments. Our SRE team combines software and systems engineering and system administration practices to build and run large-scale, massively distributed,...
-
Site Reliability Engineer
2 weeks ago
Vancouver, Canada Apple Inc. Full time
Vancouver, British Columbia, Canada Software and Services The Apple Service Engineering - SRE team is looking for Site Reliability Engineers with experience in developing processes, tools, and automation for managing distributed systems in production environments. Our SRE team combines software and systems engineering and system administration practices to...
-
Site Reliability Engineer
2 weeks ago
Vancouver, Canada Vancity Full time
Site Reliability Engineer Join Vancity, a member‑owned credit union committed to inclusion and social justice. Our cloud‑first strategy spans digital banking, core banking, data platforms, and member‑facing services. We pride ourselves on being the largest private‑sector Living Wage Employer in Canada and a Top Employer nationwide. Your Role in...
-
Site Reliability Engineer
3 days ago
Vancouver, Canada Vancity Full time
Site Reliability Engineer Join Vancity, a member‑owned credit union committed to inclusion and social justice. Our cloud‑first strategy spans digital banking, core banking, data platforms, and member‑facing services. We pride ourselves on being the largest private‑sector Living Wage Employer in Canada and a Top Employer nationwide. Your Role in...
-
Senior Site Reliability Engineer
3 days ago
Vancouver, Canada Regie Full time
We’re seeking a senior Site Reliability Engineer/DevOps who is passionate about building the best infrastructure and maintaining the health of the systems. Design and maintain scalable, secure, and reliable infrastructure to support Regie.ai's SaaS platform and AI/data workloads. Architect a unified monitoring and alerting system for engineering teams to...
-
Senior Site Reliability Engineer
1 day ago
Vancouver, Canada Cerebras Full time
Responsibilities: We’re seeking a senior Site Reliability Engineer/DevOps who is passionate about building the best infrastructure and maintaining the health of the systems. Design and maintain scalable, secure, and reliable infrastructure to support Regie.ai's SaaS platform and AI/data workloads. Architect a unified monitoring and alerting system for...