L3 Site Reliability Engineer
17 hours ago
L3 Site Reliability Engineer - Linux, Automation (Ansible), IaC (Terraform), Zookeeper
Position Description:
The Core Services L3 support team is part of the Enterprise Computing Data Services Organization. The team manages and supports a variety of applications developed in-house for purposes such as application management and application coordination using Apache Zookeeper, API Proxy, an automation platform using Ansible Automation Platform, and Infrastructure as Code using Terraform. It serves as the highest level of escalation and actively engages the engineering teams that develop the products and tooling to maintain service stability.
This position is a Level 3 support and SRE role with global responsibility for managing and supporting these middleware products, with on-call coverage to handle production escalations. The successful candidate will be involved in day-to-day management of the infrastructure environment, troubleshooting with users, and handling changes, incidents, escalations, and problem management. The person will also routinely work with the engineering teams that developed these products to resolve problems and proactively automate operational and user processes to reduce toil and time to market.
Required Skills:
- 8+ years of overall IT experience.
- Advanced Linux / Unix support experience.
- Strong shell scripting and Python programming skills for SRE-related activities.
- Experience using Splunk or the Grafana/Prometheus/Loki stack, preferably both.
- General understanding of Veritas Cluster Service, load balancers, and VMware.
- Knowledge of ITIL principles.
- Effective oral and written communication skills, and interpersonal skills to work well in a team environment.
- Strong organizational and coordination skills, with the ability to manage multiple tasks and high-pressure situations during outage handling, management, or resolution.
- Availability for weekend work.
Desired Skills:
- Experience in application support, code release, and liaison with development teams.
- Experience with automation using Ansible playbooks.
- Experience with Ansible Automation Platform administration.
- Experience with Terraform, especially Terraform Enterprise.
- Knowledge of Docker and Kubernetes/OpenShift.
- Experience with development toolchains such as Git, Bitbucket, and CI/CD tools.
- Experience with Agile methodologies.
- Good knowledge of JVMs and garbage collection mechanisms.
- Experience with relational databases.
Seniority Level: Associate
Employment Type: Full-time
Job Function: Information Technology
Industries: IT Services and IT Consulting
-
Site Reliability Engineer
2 weeks ago
Montreal, Canada
LanceSoft, Inc.
Full time
Job Title: Site Reliability Engineer
Experience Level: Level 4 (advanced): 7-15 years
Location: Montreal (Day 1 onboarding onsite / in-office presence 3x per week)
Duration: 12+ months contract
Primary Responsibilities:
- Provide L3 support for ***'s private cloud, including on-call...
-
Site Reliability Engineer
2 weeks ago
Montreal, Canada
LanceSoft, Inc.
Full time
Job Title: Site Reliability Engineer
Experience Level: Level 4 (advanced): 7-15 years
Location: Montreal (Day 1 onboarding onsite / in office presence 3x week)
Duration: 12+ months contract
Primary Responsibilities:
- Provide L3 support for ***'s private cloud, including on-call rotation
- Work closely with the internal engineering team and provide input on testing of...
-
Site Reliability Engineer
3 weeks ago
Montreal, Canada
LanceSoft, Inc.
Full time
Job Title: Site Reliability Engineer
Experience Level: Level 4 (advanced): 7-15 years
Location: Montreal (Day 1 onboarding onsite / in office presence 3x week)
Duration: 12+ months contract
Primary Responsibilities:
- Provide L3 support for ***'s private cloud, including on-call rotation
- Work closely with the internal engineering team and provide input on testing of...
-
Site Reliability Engineer
2 weeks ago
Montreal, Canada
LanceSoft, Inc.
Full time
Job Title: Site Reliability Engineer
Experience Level: Level 4 (advanced): 7-15 years
Location: Montreal (Day 1 onboarding onsite / in office presence 3x week)
Duration: 12+ months contract
Primary Responsibilities:
- Provide L3 support for ***'s private cloud, including on-call rotation
- Work closely with the internal engineering team and provide input on...
-
Site Reliability Engineer
3 weeks ago
Montreal, Canada
LanceSoft, Inc.
Full time
Job Title: Site Reliability Engineer
Experience Level: Level 4 (advanced): 7-15 years
Location: Montreal (Day 1 onboarding onsite / in office presence 3x week)
Duration: 12+ months contract
Primary Responsibilities:
- Provide L3 support for ***'s private cloud, including on-call rotation
- Work closely with the internal engineering team and provide...
-
Site Reliability Engineer
4 days ago
Montreal, Canada
LanceSoft, Inc.
Full time
Job Title: Site Reliability Engineer
Experience Level: Level 4 (advanced): 7-15 years
Location: Montreal (Day 1 onboarding onsite / in office presence 3x week)
Duration: 12+ months contract
Primary Responsibilities:
- Provide L3 support for ***'s private cloud, including on-call rotation
- Work closely with the internal engineering team and provide input on...