Current jobs related to Site Reliability Engineer - Toronto - SGS


  • Old Toronto, Canada TD Full time

    Job Overview: We are seeking a highly skilled Site Reliability Engineering Lead to join our team at TD. As a key member of our technology group, you will be responsible for ensuring the stability, scalability, and reliability of our platforms. About the Role: The ideal candidate will have a minimum of 8 years of experience in site reliability engineering, with a...


  • Toronto, Ontario, Canada Royal Bank of Canada Full time

    Royal Bank of Canada is seeking a highly skilled Site Reliability Engineering (SRE) leader to join our team in Toronto, Canada. As an SRE leader, you will be responsible for leading the development and implementation of SRE solutions that improve the reliability and performance of our applications. The ideal candidate will have 5+ years of experience as a...


  • Old Toronto, Canada Street Context Full time

    Are you a Site Reliability Engineer who has a passion for building reliable, resilient and performant systems that scale? We are on a mission to build and strengthen our engineering teams to match the accelerating success of Street Context. We provide a premium Email, Analytics and Broker Relationship platform, purpose-built for capital markets and...


  • Old Toronto, Canada Soda Full time

    Job Description Job Title: Site Reliability Engineer Location: Poland - Fully Remote Salary: 324K PLN or 27.3K monthly Start: ASAP Stack: AWS, Docker, Kubernetes, Terraform, Jenkins, Ansible, Linux, JavaScript, and Lambda. Are you a seasoned DevOps/SRE professional passionate about building high-performance, scalable systems? I am working with a Media/IT...


  • Old Toronto, Canada Sentry Full time

    The Site Reliability Engineering team is responsible for the deployment, configuration, maintenance, and monitoring of Sentry's hosted platform. We do this by leveraging automation tools to automatically spin up and scale services to meet the traffic demands of 1,000,000+ developers. Sentry receives over a billion events a day and processes terabytes of...


  • Old Toronto, Canada Olx Full time

    Site Reliability Engineer. Remote Poland, Poland. OLX – Engineering / Full-time / Remote. At OLX, we work together to build a more sustainable world through trade. We make it safe, smart, and convenient to buy and sell cars, find housing, get jobs, buy and sell household goods, and more. Our colleagues around the world help to serve millions of people around...


  • Old Toronto, Canada Thomson Reuters Full time

    (Canada) Site Reliability Engineer (Contract). Contract (9 months 4 days). Published 3 days ago. New Relic, Datadog. Site Reliability Engineer in the Service Management Organization. Do you have experience in IT Service Management, working with cloud providers, software development, and technology infrastructure? The Site Reliability Engineer will analyze chronic...


  • Old Toronto, Canada Mastech Inc. Full time

    Mastech Digital is an IT Staffing and Digital Transformation Services company. Mastech Digital provides digital and mainstream technology staff as well as Digital Transformation Services for all American Corporations. We are currently seeking a Site Reliability Engineer (GCP) for our client in the Consulting domain. We value our professionals, providing...


  • Old Toronto, Canada Tecsys Inc. Full time

    Having recognized the advantages of remote work, including employee morale, productivity, reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...


  • Old Toronto, Canada Tbwa ChiatDay Inc Full time

    Automate and Optimize Brick and Mortar Retail. Focal Systems is the industry leader in retail AI solutions, revolutionizing brick and mortar retail with deep learning computer vision. As a Silicon Valley-based startup, we have more than doubled in size every year since inception. Our Mission: We are looking for smart, creative, and passionate individuals who want...


  • Old Toronto, Canada Tecsys Full time

    Tecsys is a fast-growing innovator offering supply chain solutions to industry-leading healthcare systems, hospitals, and pharmacy businesses to distributors, retailers, and 3PLs. As a Cloud Infrastructure Specialist, you will be responsible for ensuring the reliability and uptime of our platform and applications in a data-driven way to support internal and...


  • Old Toronto, Canada Ascend Fundraising Solutions Full time

    We are currently seeking a full-time Site Reliability Engineer to join our IT team. In this role, you will collaborate closely with the client services team to diagnose, troubleshoot, and resolve issues related to system reliability. RESPONSIBILITIES: Take ownership of customer-reported issues and see problems through to resolution. Develop preventive measures...


  • Old Toronto, Canada Tecsys Full time

    Having recognized the advantages of remote work, including employee morale, productivity, reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...


  • Old Toronto, Canada RBC Full time

    About the Role: We are seeking an experienced Senior Site Reliability Engineer to join our US Cash Management Technology team at RBC. As a key member of our team, you will be responsible for leading the development, implementation, and support of Site Reliability Engineering (SRE) solutions for applications supported by the Commercial, Core Banking, and...


  • Old Toronto, Canada Sentry Full time

    Bad software is everywhere, and we’re tired of it. Sentry is on a mission to help developers write better software faster, so we can get back to enjoying technology. With more than $217 million in funding and 100,000+ organizations that believe we’re on to something, we're building performance and error monitoring tools that help companies like Disney,...


  • Old Toronto, Canada Loblaw Companies Ltd - Head Office Full time

    Cloud Engineering Opportunity: We are seeking an experienced Site Reliability Engineer to join our team at Loblaw Companies Ltd - Head Office. This role offers a unique opportunity to design, develop, and maintain cloud native solutions using services like Kubernetes, AppEngine, Cloud Functions, CloudSql, BigQuery, Pub/Sub on Google Cloud Platform and...


  • Toronto, Ontario, Canada Peter Lucas Project Management Inc. Full time

    Job Overview: A leading project management company, Peter Lucas Project Management Inc., is seeking a skilled Reliability Engineering Specialist to join their team. This critical role involves developing and implementing asset maintenance strategies, conducting root cause analysis, creating risk mitigation plans, and optimizing preventative maintenance...


  • Toronto, Ontario, Canada Compunnel Inc. Full time

    At Compunnel Inc., we are looking for a talented Senior Site Reliability Engineer/DevOps to join our team. This is a challenging opportunity to work with the latest tools and technologies to drive forward Automation, Observability and CI/CD automation. The ideal candidate is passionate about driving SRE DevSecOps mindset and culture in a fast-paced...


  • Toronto, Canada Compunnel Inc. Full time

    Hi, Good Morning, Hope you are doing well. Please let me know if you are interested in this position. Title: Sr SRE Lead. Location: Toronto, Canada (Day 1 Onsite – Hybrid 3 days onsite). Job Description: You are passionate about driving SRE DevSecOps mindset and culture in a fast-paced, challenging environment where you get the opportunity to work with a...

Site Reliability Engineer

3 months ago


Toronto, Canada SGS Full time
Job Description

The Site Reliability Engineer will play a critical part in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with MVC, Angular, and Web API.

  • Partner with developers and product operations teams to understand application requirements and translate them into operational practices.
  • Design, implement, and maintain infrastructure automation tools using Infrastructure as Code (IaC) methodologies.
  • Monitor application health and performance metrics, proactively identifying and resolving potential issues.
  • Implement incident response procedures to ensure timely resolution of outages and service disruptions.
  • Establish and improve best practices for product solution design, architecture, and development.
  • Participate in peer and team code reviews, and develop comprehensive coding standards and guidelines to ensure consistency, maintainability, and quality in software development. Clear protocols for code formatting, naming conventions, error handling, testing, and documentation enhance code readability, reduce defects, and facilitate knowledge sharing among team members.
  • Collaborate with engineers to develop and implement disaster recovery plans.
  • Continuously improve monitoring and alerting processes to ensure efficient problem identification and resolution.
  • Stay up to date on the latest advancements in .NET infrastructure and SRE best practices.

Qualifications

  • Bachelor's degree required.
  • Minimum of 3 years of experience in a related technical role (e.g., Systems Administrator, Network Engineer) required.
  • Experience with configuration management tools like Ansible, Puppet, or Chef preferred.
  • Azure experience required.
  • Familiarity with monitoring and alerting tools (.NET performance counters, Azure Application Insights, Prometheus, Grafana) preferred.
  • Ability to manage and coordinate multiple projects in a fast-paced, highly professional environment.
  • While coding proficiency is not required, a strong understanding of the .NET ecosystem and a desire to delve into infrastructure and automation are essential for success.
  • Strong understanding of system administration principles, including operating systems (Windows Server preferred) and networking concepts.
  • Ability to work independently and as part of a team.