Senior Site Reliability Engineering Specialist
2 weeks ago
We're seeking someone to join our EC Modern Infra Platforms team as a Senior Site Reliability Engineering Specialist in Enterprise Computing. You will lead the SRE optimization effort across multiple infrastructure teams in Modern Container Platforms at Morgan Stanley, spanning on-prem and public cloud environments, to drive ongoing optimization and automation.

In the Technology division, we leverage innovation to build the connections and capabilities that power our Firm, enabling our clients and colleagues to redefine markets and shape the future of our communities.

This is an Infrastructure Production Management & Reliability Engineering position at Vice President level, part of the job family responsible for maintaining the stability and reliability of the organization's infrastructure systems, ensuring optimal performance and availability to support business operations.

Since 1935, Morgan Stanley has been known as a global leader in financial services, always evolving and innovating to better serve our clients and our communities in more than 40 countries around the world.

Interested in joining a team that’s eager to create, innovate and make an impact on the world? Read on…

What you'll do in the role:
Observability & Monitoring - Review monitoring, logging and alerting capabilities to gain insight into system health and performance, and track and report key SLI metrics. Lead efforts on noise reduction, toil management and capacity planning.
Architectural Consultation - Collaborate with engineering teams during the design phase to build reliability and efficiency into new services from the start.
Knowledge Sharing and Mentoring - Guide the wider SRE and application teams on best practices, shared responsibility, observability and system tuning. Collaborate with business application users to understand their observability requirements and provide solutions and training in a consistent manner across the firm.
Cost Optimization - Perform ongoing reviews of manual tasks and drive cross-team automation and systematic solutions to reduce toil. Review resource utilization to optimize spending without impacting performance metrics or headroom.

What you'll bring to the role:
Bachelor’s degree in Computer Science or equivalent
5+ years of relevant experience in an SRE, DevOps or infrastructure-focused software role
3+ years of experience with Kubernetes
3+ years of experience managing observability tools such as Prometheus, Grafana and PagerDuty
Deep knowledge of container orchestration technologies
Proficiency with Infrastructure as Code using Ansible or Terraform
Strong scripting experience using Python
Strong analytical, problem-solving and collaboration skills
Experience with performance and capacity management practices

At Morgan Stanley Montreal, we support the Firm’s global businesses and infrastructure with cutting-edge technology and innovation. The multi-faceted and highly technical Montreal team plays a critical role in building and maintaining our leading technology platform, including electronic trading, algorithmic trading, cloud engineering, infrastructure, cybersecurity and AI/ML. Morgan Stanley has been rooted in the Montreal community since 2008 and is considered a leading employer among the area’s highly skilled technology talent. There’s ample opportunity to move across the businesses for those who show passion and grit in their work.

All our positions are located in Montreal, Quebec. We offer a hybrid work environment, combining remote work and attendance in the office. Knowledge of French and English is required.

Build a career with impact. Visit morganstanley.com for more information.

WHAT YOU CAN EXPECT FROM MORGAN STANLEY:
We are committed to maintaining the first-class service and high standard of excellence that have defined Morgan Stanley for over 89 years.
Our values - putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back - aren’t just beliefs; they guide the decisions we make every day to do what's best for our clients, communities and more than 80,000 employees in 1,200 offices across 42 countries. At Morgan Stanley, you’ll find an opportunity to work alongside the best and the brightest, in an environment where you are supported and empowered. Our teams are relentless collaborators and creative thinkers, fueled by their diverse backgrounds and experiences. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry. There’s also ample opportunity to move about the business for those who show passion and grit in their work.

To learn more about our offices across the globe, please copy and paste into your browser.

Morgan Stanley is an equal opportunities employer. We work to provide a supportive and inclusive environment where all individuals can maximize their full potential.
-
Site Reliability Engineer
4 weeks ago
Montreal, Canada | ApTask | Full time
Looking for an intermediate candidate with 2 to 5 years' experience. The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive reliability engineering, operations and customer support services for the client's ServiceNow SaaS implementation. Reporting to a Site Reliability...
-
Senior Engineer, Reliability
20 hours ago
Montreal (administrative region), Canada | VIA Rail Canada | Full time
Did you know that VIA Rail is carrying out ambitious projects to modernize its services and infrastructure? From our new ultramodern train fleet to ongoing improvement of our infrastructure, we’re building the future of transportation in Canada. Working for VIA Rail is being a part...
-
Senior Engineer, Reliability
2 days ago
Montreal (administrative region), Canada | VIA Rail Canada | Full time
Did you know that VIA Rail is carrying out ambitious projects to modernize its services and infrastructure? From our new ultramodern train fleet to ongoing improvement of our infrastructure, we’re building the future of transportation in Canada. Working for VIA Rail is being a part...
-
Site Reliability Engineer
18 hours ago
Montreal, Canada | Open Systems Technologies | Full time
Site Reliability Engineer (SRE), ServiceNow, Application Infrastructure
Location: Montreal – Hybrid – 3 days/week
The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive reliability engineering, operations and customer support services for the client’s ServiceNow SaaS implementation. Reporting to a Site...
-
Site Reliability Engineer
3 days ago
Montreal, Canada | Open Systems Technologies | Full time
Site Reliability Engineer (SRE), ServiceNow, Application Infrastructure
Location: Montreal – Hybrid – 3 days/week
The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive reliability engineering, operations and customer support services for the client’s ServiceNow SaaS implementation. Reporting to a Site...
-
Site Reliability / GitOps Engineer
2 days ago
Montreal (administrative region), Canada | Canonical | Full time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our...
-
Site Reliability / GitOps Engineer
18 hours ago
Montreal (administrative region), Canada | Canonical | Full time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT....
-
Site Reliability / GitOps Engineer
3 days ago
Montreal (administrative region), Canada | Canonical | Full time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our...
-
Senior Site Reliability Engineer
5 days ago
Montreal, Canada | Orion Innovation | Full time
Orion Innovation is a premier, award-winning, global business and technology services firm. Orion delivers game-changing business transformation and product development rooted in digital strategy, experience design, and engineering, with a unique combination of agility, scale, and maturity. We work with a wide range of clients across many industries...
-
Senior Site Reliability Engineer
20 hours ago
Montreal West, Canada | Orion Innovation | Full time
Orion Innovation is a premier, award‑winning, global business and technology services firm. Orion delivers game‑changing business transformation and product development rooted in digital strategy, experience design, and engineering, with a unique combination of agility, scale, and maturity. We work with a wide...