Site Reliability Engineer
2 weeks ago
Title: Site Reliability Engineer (SRE)
Location: Bangalore, Karnataka, IN, 560071
Requisition ID: 127074
Job Summary
As a Site Reliability Engineer (SRE) with a specialization in storage, you'll manage and optimize a portfolio of customer-facing cloud services (SaaS/IaaS) on Google Cloud Platform (GCP), ensuring their overall availability, performance, and security. You will collaborate closely with global teams from NetApp and GCP, with a primary focus on supporting Google Cloud NetApp Volumes. This position includes rotational on-call work as part of a global team due to the critical nature of the services we support.
You will be working in a dynamic and fast-paced environment as an engineer on the Site Reliability Engineering (SRE) team. This team is responsible for assisting customers of Google Cloud NetApp Volumes in resolving complex technical issues in production environments. We are seeking an SRE with a deep understanding of storage systems, complex distributed systems, and cloud technologies, and the ability to articulate these concepts clearly to customers and fellow engineers.
You will work with your teammates and our customers to support innovative, cutting-edge technologies that address real-world challenges. You will provide valuable feedback and guidance to our Product and Engineering teams while representing the voice of our customers. You have the opportunity to make a significant impact and take real ownership of your work.
Job Requirements
o Collaborate with external customers and partners to ensure their success with Google Cloud NetApp Volumes.
o Respond to, troubleshoot, and drive root cause analysis (RCA) of complex live production incidents, including cross-platform issues involving OS, networking, and databases in cloud-based SaaS/IaaS environments by following and implementing SRE best practices.
o Continuously monitor, analyze, and measure system health, availability, and latency using tools like Prometheus, Google Cloud Monitoring, Elasticsearch, Grafana, and SolarWinds. Develop and implement steps to improve system and application performance, availability, and reliability.
o Document system knowledge, create runbooks, and ensure critical system information is readily available.
o Stay up to date with security trends and proactively identify, diagnose, and resolve complex security issues.
o Maintain and monitor the deployment and orchestration of servers, Docker containers, databases, and general backend infrastructure.
o Automate manual tasks and any system components that would benefit from automation.
o Utilize Atlassian Jira to track issues to resolution based on their priority.
o Engage in incident management processes and resolve issues within agreed SLAs/SLOs.
o Extensive experience in storage technologies and incident management processes.
o Advanced knowledge of Linux operating systems (e.g., Ubuntu, CentOS).
o Proficiency in container-based architecture (e.g., Kubernetes).
o Intermediate to advanced knowledge of automation tools and scripting languages such as Ansible, Python, Bash, Go, and PowerShell.
o Solid understanding of algorithms, data structures, and databases (SQL/NoSQL).
o Intermediate knowledge of networking concepts.
o Hands-on experience with cloud environments, particularly GCP.
o Exceptional debugging skills across various platforms and technologies.
o Familiarity with site reliability engineering principles and best practices.
Education
BE in Computer Science or a related field, or 6+ years of professional experience in a relevant role.
Job Segment: Cloud, Software Engineer, Database, Computer Science, Linux, Technology, Engineering
-
Site Reliability Engineer
3 weeks ago
Vancouver, British Columbia, Canada Electronic Arts Full time. Responsibilities: We are seeking a skilled Site Reliability Engineer to join our team at Electronic Arts. As a Site Reliability Engineer, you will work closely with our development teams to address build issues and improve our systems. Key Responsibilities: Collaborate with development teams to identify and resolve build issues. Create and maintain pipelines and...
-
Senior Site Reliability Engineer
3 weeks ago
Vancouver, British Columbia, Canada Royal Bank of Canada Full time. Job Summary: The Royal Bank of Canada is seeking a skilled Site Reliability Engineering Specialist to join its team. This role will be responsible for the support, development, and implementation of Site Reliability Engineering solutions for all applications within the bank's technology infrastructure. Key Responsibilities: Support and Development of Site...
-
Site Reliability Engineer- Automation
4 weeks ago
Vancouver, Canada Themis Solutions Inc. Full time. We are currently seeking a new Site Reliability Engineer, Co-op, to join our Engineering team in Burnaby, Calgary or Toronto. Applicants should be available for an 8-month co-op period from January 2025 to August 2025. What your team does: As a Site Reliability Engineer, you will help build, improve, and maintain Clio’s globally distributed network of...
-
Lead Site Reliability Engineer
3 months ago
Vancouver, Canada Royal Bank of Canada Full time. Job Summary: The Lead Support SRE will be responsible for supporting and spearheading the development and implementation of Site Reliability Engineering solutions for all applications within City National Bank (CNB), an RBC company. This team will work collaboratively with teams across several li
-
Senior Site Reliability Engineer
3 months ago
Vancouver, Canada Royal Bank of Canada Full time. Job Summary: The Application Support SRE will be responsible for the support, development, and implementation of Site Reliability Engineering solutions for all applications within City National Bank (CNB), an RBC company. This team will work collaboratively with teams across several lines of business
-
Site Reliability Engineer II
6 months ago
Vancouver, Canada Microsoft Full time. Overview: Are you an individual who loves to work on large-scale projects at one of the most exciting and diverse divisions within Microsoft? Are you looking for big, creative challenges that show immediate results since your customers are the product engineers for Office and M365? Do you want to be at the core of it all, acting as a force multiplier...
-
Site Reliability Specialist
3 weeks ago
Vancouver, British Columbia, Canada Perlego Full time. About the Role: We are currently seeking a highly skilled Site Reliability Engineer to join our team at Perlego. As a Site Reliability Engineer, you will play a critical role in ensuring the availability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design, implement, and maintain scalable and highly available cloud-based...
-
Senior Site Reliability Engineer
2 days ago
Vancouver, Canada Microsoft Canada Full time. Microsoft is a company where passionate innovators come to collaborate, envision what can be and take their careers further. This is a world of more possibilities, more innovation, more openness, and the sky is the limit thinking in a cloud-enabled world. Microsoft’s Azure Data engineering team is leading the transformation of analytics in the world of...
-
Senior Site Reliability Engineer
2 days ago
Vancouver, Canada Microsoft Canada Full time. Microsoft is a company where passionate innovators come to collaborate, envision what can be and take their careers further. This is a world of more possibilities, more innovation, more openness, and the sky is the limit thinking in a cloud-enabled world. Microsoft’s Azure Data engineering team is leading the transformation of analytics in the world of data...
-
Senior Site Reliability Engineer
2 days ago
Vancouver, Canada Microsoft Canada Full time. Are you interested in working for one of the most exciting teams at Microsoft? Then look no further than the Microsoft Teams SRE team. You will be building solutions that leverage state-of-the-art technologies to deliver the next evolution in collaboration and teamwork. What is a Site Reliability Engineer (SRE)? SRE is what you get when you treat operations as...
-
Site Reliability Engineer- Automation
2 months ago
Vancouver, Canada Arista Full time. Site Reliability Engineer (SRE) - CloudVision. Full-time. Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined...
-
Senior Site Reliability Engineer
2 days ago
Vancouver, Canada RBC Full time. Job Summary: The Application Support SRE will be responsible for the support, development, and implementation of Site Reliability Engineering solutions for all applications within City National Bank (CNB), an RBC company. This team will work collaboratively with teams across several lines of business and other Technology and Operations partners as a...
-
AWS Site Reliability Engineer
2 months ago
Vancouver, Canada TrustFlight Full time. TrustFlight is at the forefront of digitizing the aviation industry with the creation of intelligent workflow applications that automate operating and maintenance processes, enabling our customers to focus on the data and insights that matter. We continue to build an amazing group of people who are all here to make our products, services and culture the...
-
Site Reliability Leader
2 weeks ago
Vancouver, British Columbia, Canada Royal Bank of Canada Full time. Job Summary: The Royal Bank of Canada seeks a skilled Site Reliability Engineer to lead the development and implementation of SRE solutions for all applications within the organization. This role requires collaboration with cross-functional teams to ensure successful delivery of technology solutions. Key Responsibilities: Develop and maintain production support...
-
Highly Skilled Site Reliability Engineer
2 weeks ago
Vancouver, British Columbia, Canada Royal Bank of Canada Full time. Company Overview: The Royal Bank of Canada (RBC) is a leading financial institution that prides itself on providing exceptional banking services to its clients. With a strong presence in the Canadian market, RBC has a reputation for innovation and customer satisfaction. Salary: We are offering a highly competitive salary range of $120,000 - $180,000 per year,...
-
Vancouver, British Columbia, Canada S.i. Systems Full time. Job Description: We are seeking a Senior Site Reliability Engineer to develop robust observability solutions using Dynatrace and automate key monitoring processes through Terraform and PowerShell. Key Responsibilities: • Develop and implement observability solutions using Dynatrace • Automate key monitoring processes through Terraform and PowerShell. About the...
-
DevOps Engineer
2 months ago
Vancouver, Canada Azad Technology Partners Full time. AZAD Technology Partners is seeking a Site Reliability Engineer/DevOps Engineer for a full-time, W2 Contract position based in Chicago, IL. Schedule: Full-time, 40 hours/week, Hybrid. Assignment Duration: 10 Months. AZAD Technology Partners is committed to Diversity, Equity & Inclusion and is striving to build an even more diverse, inclusive team that...
-
Lead Site Reliability Engineer
2 days ago
Vancouver, Canada RBC Full time. Job Summary: The Lead Support SRE will be responsible for supporting and spearheading the development and implementation of Site Reliability Engineering solutions for all applications within City National Bank (CNB), an RBC company. This team will work collaboratively with teams across several lines of business and other Technology and Operations...