Site Reliability Engineer
1 week ago
Job Title: Site Reliability Engineer
Location: Montreal – Hybrid – 3 days/week
Term: 12-month contract plus extension
The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive reliability engineering, operations and customer support services for the client's ServiceNow SaaS implementation. The role reports to a Site Reliability Engineering & Operations Lead.
This position specializes in ServiceNow Software as a Service, which provides a suite of IT service management capabilities and is integrated with many products such as chatbot technology, on-call escalation and incident management, and a range of other on-premises infrastructure (including SQL databases, APIs, and web infrastructure). Despite the focus on value-add development and process delivery, this is also a production-side, operational role requiring participation in an on-call rotation from time to time.
Successful candidates for SRE roles in Application Infrastructure have so far come from a variety of backgrounds: for example, a developer looking to evolve site reliability as a practice, an infrastructure specialist with an interest in reliability and resilience principles, or a strong system administrator who enjoys troubleshooting and has some task automation experience.
Prior experience in the financial services industry is not required, and we welcome candidates from all industries and backgrounds to apply.
Responsibilities include:
• Delivery of improvements that will maximize the availability and performance of supported systems through optimized and automated operational tasks, collaborating on the development of operational tools, ongoing problem management, and architecture reviews with colleagues.
• Troubleshooting ServiceNow issues, and from time to time some on-premises capabilities in a Linux environment, collaborating with others to get to the bottom of issues, and agreeing on lasting improvements that can be made.
• Exploring and delivering observability including metrics, logging, tracing and alerting that can define and measure the target reliability of a product.
• Being dependable and responsive during agreed hours, such as when on the on-call rotation with the rest of the global team (with a time off in lieu system).
• A commitment to understanding the Firm's ServiceNow instances and related dependencies, and contributing to their documentation.
• Identification and prioritization of technical debt that can impact client satisfaction or operational efficiency.
• Giving feedback on policies and procedures related to the delivery of SRE and operational practices, with a view to continually making the Firm safer and more efficient.
Skills required:
• The ideal candidate would have at least one of: software development skills in one or more programming languages (e.g. Python), or ServiceNow administration or development experience.
• 10+ years of experience
• Proficient oral and written communication skills
• Establishing warm, effective relationships with colleagues to collaborate on successful delivery
• A dependable team worker with demonstrated commitment to client service
• Ability to respond appropriately during occasional technical emergencies, like outages.
• Openness to working in an on-call rotation
Skills desired:
• ServiceNow administration or development experience, although the successful candidate can acquire this on the job and through training.
-
Site Reliability Engineer
1 week ago
Montreal, Quebec, Canada Roshan Consulting Services Full time
Company Description: Roshan Consulting empowers businesses to optimize operations and enhance efficiency through innovative strategies and technologies tailored to their unique needs. Our mission is to drive digital transformation and deliver sustainable growth by offering services such as Robotic Process Automation (RPA), business process optimization, and...
-
Site Reliability Engineer
5 days ago
Montreal, Quebec, Canada Open Systems Technologies Full time
The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive the reliability engineering, operations and customer support services for Morgan Stanley's ServiceNow SaaS implementation. Reporting to a Site Reliability Engineering & Operations Lead. This role requires delivering a range of SRE practices within a...
-
Site Reliability Engineer
3 days ago
Montreal, Quebec, Canada Tecsys Inc. Full time
Having recognized the advantages of remote work, including employee morale, productivity, reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...
-
ServiceNow Site Reliability Engineer
1 day ago
Montreal, Quebec, Canada Soho Square Solutions Full time
Site Reliability Engineer (SRE) – ServiceNow, Application Infrastructure
Experience Level: Level 4 (Advanced) – 7 to 15 years
Duration: 1-Year Contract
Location: Montreal (Day 1 onsite onboarding; in-office presence 3x per week)
Role Overview: The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to drive reliability...
-
Site Reliability Engineer
2 weeks ago
Montreal, Quebec, Canada Omiz Staffing Solutions (OSS) Full time
Position: Site Reliability Engineer
Location: Montreal, QC Canada (Hybrid – 3-4 days onsite in a week)
Duration: Long-Term Contract
Job Description: Delivery of improvements that will maximize the availability and performance of supported systems through optimized and automated operational tasks, collaborating on the development of operational tools, ongoing...
-
Senior Site Reliability Engineer
2 weeks ago
Montreal, Quebec, Canada Botpress Technologies Inc. Full time
Description: Help bring AI agents to companies worldwide. Over the next decade, autonomous agents will redefine how we work. Botpress allows companies to build and deploy advanced AI agents that move beyond conversation into real business logic. Our product works today and at scale, across industries, regions, and limitless use cases. As the 3rd...
-
Site Reliability Engineer
5 days ago
Montreal, Quebec, Canada Intelcom | Dragonfly Full time
With more than 100 sorting stations and operations across three continents, Intelcom | Dragonfly is Canada's leader in last-mile logistics. Our vision is clear: to deliver fast, accurate, and reliable service powered by cutting-edge technology.
A Strategic Role at the Heart of Logistics
Responsibilities
Incident Management: Detect and respond...
-
Site Reliability Engineering Manager
7 days ago
Montreal, Quebec, Canada Aduna Global Full time
All Together, Extraordinary
At Aduna, we're building the backbone of the global API economy. By connecting telecom operators, cloud platforms, and software innovators, we enable the next generation of digital communication services. The Portal is a secure, role-based web app offering partners and internal teams a single place to access information, collaborate, and...
-
Site Reliability Engineer
1 week ago
Montreal, Quebec, Canada TMC Canada Full time
Summary: The Private Cloud SRE L3 team is part of the Enterprise Computing organization. The team has presence in cities globally and is focused on supporting cloud and container-based platforms for internal and external clients. You will integrate with the global follow-the-sun operations model, which translates to responsibility for technologies...
-
Montreal, Quebec, Canada Stingray Full time
At Stingray, creativity, collaboration, and innovative technology are the pillars of our DNA. Are you ready to rock your career by joining a growing company, a team of music enthusiasts in a stimulating and fun work environment? We are currently looking for a Software developer – Site Reliability Engineering to join our SRE team. This position reports to...