Site Reliability Engineer
3 weeks ago
We are looking for an exceptional Site Reliability Engineer to join our Network and Security Operations Center team. As a key member of our team, you will be responsible for ensuring the reliability and uptime of our platform and applications.
Key Responsibilities:
- Collaborate with Engineering teams to support services through system design consulting, software development, capacity planning, and launch reviews.
- Maintain services once live by measuring and monitoring availability, latency, and overall system health.
- Develop tools and automation on top of Azure and AWS to reduce manual intervention.
- Scale systems sustainably through automation and evolve systems to improve reliability and velocity.
- Be on-call and practice sustainable incident response and blameless postmortems.
- Implement automated solutions for continuous integration and delivery (CI/CD).
- Implement monitoring, logging, alerting, and SLA reporting.
- Implement service monitoring dashboards displaying key metrics.
- Create and maintain technical documentation.
- Apply SRE best practices.
- Take command of high-severity incidents and facilitate their resolution.
Qualifications:
- Bachelor's degree in computer science or related technical discipline.
- At least 5 years' experience in systems engineering, with demonstrable technical experience in new platform development, orchestration, product ownership, and iterative design and deployment.
- Experience designing and deploying large-scale systems, multi-vendor platforms, and globally distributed infrastructure.
- Strong knowledge of system design, high-performance computing, file, block, and storage technologies, and integration of compute, storage, and network technologies.
- High-level understanding of full-stack automation, with examples of projects executed using it.
- Self-organize, collaborate, and manage efforts with peers and teams across responsibility areas, languages, geography, and time zones.
- Be a curious self-starter who is not afraid to ask questions and challenge the way things are done today.
- See a problem or opportunity, take ownership, and act on it independently.
- Knowledge of Datadog, Rapid7 Insight, AWS, Azure, Java, .NET, and GitLab, as well as experience at a SaaS company, is preferred.
-
Site Reliability Engineering Leader
6 days ago
Toronto, Ontario, Canada Royal Bank of Canada Full time
Royal Bank of Canada is seeking a highly skilled Site Reliability Engineering (SRE) leader to join our team in Toronto, Canada. As an SRE leader, you will be responsible for leading the development and implementation of SRE solutions that improve the reliability and performance of our applications. The ideal candidate will have 5+ years of experience as a...
-
Site Reliability Engineer
6 days ago
Toronto, Ontario, Canada PointsBet Canada Full time
SITE RELIABILITY ENGINEER. ABOUT THE ROLE: As a Site Reliability Engineer (SRE), you will ensure the reliability, scalability, and performance of our product. You will lead efforts in proactive monitoring, incident management, and automation, collaborating across teams to implement best practices in reliability engineering. Your expertise will drive resilient...
-
Site Reliability Engineering Lead
3 weeks ago
Toronto, Ontario, Canada Compunnel Inc. Full time
Compunnel Inc. is a leading provider of innovative technology solutions. We are seeking an experienced Site Reliability Engineering Lead to join our team in Toronto, Canada. The estimated salary for this position is $170,000 per year, considering the location and industry standards. About the Job: This role is perfect for someone who is passionate about driving...
-
Staff Site Reliability Engineer
1 month ago
Toronto, Ontario, Canada Index Exchange Full time
About the Role: We are seeking a highly skilled Staff Site Reliability Engineer to own and develop on-premise and hybrid cloud environments, focusing on low-latency performance on Kubernetes platforms supporting a robust developer experience framework. The ideal candidate will have a deep technical understanding of on-premise and hybrid cloud architectures and...
-
Chief Site Reliability Engineer
3 weeks ago
Toronto, Ontario, Canada Index Exchange Full time
About Index Exchange: We have a rich history of shaping the earliest forms of ad tech, and we're now looking for talented engineers to help drive its future. Our customers face unique challenges that require technical expertise at internet scale. Our infrastructure handles over 450 billion requests daily, all running in our own global data centers. We provide...
-
Cloud Native Site Reliability Engineer
3 weeks ago
Toronto, Ontario, Canada Thomson Reuters Full time
We are seeking an experienced Senior SRE to join our Shared Capabilities, Service Reliability and Operation team in Toronto. As a Cloud Native Site Reliability Engineer, you will be responsible for implementing site reliability engineering and DevOps best practices, building and maintaining monitoring for all aspects of infrastructure, micro-services, usage...
-
Site Reliability Team Lead
4 weeks ago
Toronto, Ontario, Canada Sentry Full time
About Sentry: We're on a mission to help developers write better software faster, so we can get back to enjoying technology. With more than $217 million in funding and 100,000+ organizations that believe we're on to something, we're building performance and error monitoring tools that help companies like Disney, Microsoft, and Atlassian spend less time fixing...
-
Highly Skilled Site Reliability Engineer
5 days ago
Toronto, Ontario, Canada Compunnel Inc. Full time
At Compunnel Inc., we are looking for a talented Senior Site Reliability Engineer/DevOps to join our team. This is a challenging opportunity to work with the latest tools and technologies to drive forward Automation, Observability and CI/CD automation. The ideal candidate is passionate about driving SRE DevSecOps mindset and culture in a fast-paced...
-
Site Reliability Specialist
4 weeks ago
Toronto, Ontario, Canada Royal Bank of Canada Full time
Job Summary: We are seeking a talented Site Reliability Engineer to join our Digital team at Royal Bank of Canada. This is an exciting opportunity to accelerate our cloud native initiatives and make a difference in the industry. Job Description: We are looking for an individual who embodies leadership, mentorship, and decision-making qualities. As a Site...
-
Reliability Engineering Specialist
3 weeks ago
Toronto, Ontario, Canada Criteo Full time
About the Role: This is a challenging opportunity for an experienced engineer to join Criteo's PRE team as a Site Reliability Engineer. The role involves working closely with product engineering to improve the reliability of our apps, systems, and pipelines, assessing where optimization is needed most, and telling stories with meaningful monitoring. Key...
-
Global Site Reliability Engineer
3 weeks ago
Toronto, Ontario, Canada mccainfood Full time
Job Summary: We are seeking a highly skilled Global Site Reliability Engineer to join our team. As a key member of our organization, you will be responsible for ensuring the reliability, performance, and scalability of our global communication services.
-
Reliability Engineering Lead
2 days ago
Toronto, Ontario, Canada Royal Bank of Canada Full time
Job Summary: Royal Bank of Canada is seeking an experienced professional to lead our Site Reliability Engineering (SRE) efforts for our US Cash Management Technology. This is a unique opportunity to shape the future technology landscape of the company, delivering key business values and implementing strategic components across all RBC functions defined in our...
-
Reliability Engineering Manager
4 weeks ago
Toronto, Ontario, Canada Estée Lauder Companies Full time
Reliability Engineering Manager Role: We are seeking a highly skilled Reliability Engineering Manager to join our team at Estée Lauder Companies. As a key member of the Plant Management Team, you will be responsible for leading maintenance and reliability processes to achieve operational excellence. The ideal candidate will have a strong background in plant...
-
Reliability Engineering Expert
4 weeks ago
Toronto, Ontario, Canada Criteo Full time
About the Role: Criteo is seeking a talented Site Reliability Engineer to join our PRE team. What You'll Do: As a Site Reliability Engineer, you'll work closely with product engineering to improve the reliability of our apps, systems, and pipelines. You'll assess where optimization is needed most and tell stories with meaningful monitoring. How You'll Make an...
-
Senior Site Reliability Engineering Manager
3 weeks ago
Toronto, Ontario, Canada Index Exchange Full time
About Index Exchange: We are shaping the future of ad tech and seeking an experienced Senior Site Reliability Engineering Manager to lead our SRE team. As a key member of our technical leadership, you will be responsible for building and managing a high-performing SRE team, fostering a culture of innovation, collaboration, and accountability. You will provide...
-
Site Reliability Engineer
1 month ago
Toronto, Ontario, Canada SGS Full time
Job Summary: The Site Reliability Engineer will play a critical part in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with ASP.NET MVC, Angular, and Web API. Key Responsibilities: Partner with developers and product operations teams to understand application requirements and translate them into...
-
Site Reliability Engineer/DevOps Expert
6 days ago
Toronto, Ontario, Canada Compunnel Inc. Full time
About Compunnel Inc.: Compunnel Inc. is a fast-paced and dynamic company seeking a skilled Site Reliability Engineer/DevOps Expert to join our team. Job Description: We are looking for an experienced professional who can drive the SRE DevSecOps mindset and culture in our organization. The ideal candidate will have a strong passion for driving automation,...
-
Site Reliability Engineer for Long-Term Project
4 weeks ago
Toronto, Ontario, Canada Lorven Technologies Full time
We are seeking a skilled Site Reliability Engineer to support our long-term project in a hybrid environment. The successful candidate will have strong expertise in Azure and OpenShift, as well as experience with Dynatrace/ELK/Splunk for monitoring and observability. Key Responsibilities: Develop SRE solutions (monitoring and alerting, machine learning anomaly...
-
Reliability Systems Engineer
4 days ago
Toronto, Ontario, Canada Teranet Inc. Full time
About Teranet: Teranet is a leading innovator in electronic services and solutions, operating one of the most advanced and secure registration systems worldwide. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our DevOps team. The ideal candidate will possess strong software engineering principles and infrastructure expertise to...
-
Senior Site Reliability Engineer
4 weeks ago
Toronto, Ontario, Canada Thomson Reuters Full time
About the Role: In this opportunity as a Senior Site Reliability Engineer, you will: Identify options for problem resolution and initiate action. Engage others as appropriate and escalate as required. Liaise with various application development and content teams, customer service teams, and other software and hardware support teams. Proactively monitor production...