Current jobs related to Reliability Operations Engineer - Toronto, Ontario - Tecsys Inc.

  • Reliability Engineer

    1 month ago


    Old Toronto, Ontario, Canada Resolute Workforce Solutions Full time

    Resolute Workforce Solutions is a leading provider of staffing solutions for organizations seeking to improve their operational efficiency and effectiveness. We are dedicated to delivering high-quality talent to support our clients' goals and objectives. We are currently seeking a highly skilled Maintenance Reliability Engineer to join our team in Toronto....

  • Reliability Engineer

    2 weeks ago


    Old Toronto, Ontario, Canada TD Bank Full time

    Job Title: Site Reliability Engineer. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our Technology Solutions team in Canada. As a key member of our team, you will be responsible for ensuring the reliability and performance of our technology infrastructure. Key Responsibilities: Collaborate with cross-functional teams to design,...


  • Toronto, Ontario, Canada Metrolinx Full time

    Job Title: Senior Reliability Engineer. Job Summary: Metrolinx is a leading transportation agency in the Greater Golden Horseshoe region, operating GO Transit, UP Express, and the PRESTO fare payment system. We are committed to providing reliable and efficient transportation services to our customers. As a Senior Reliability Engineer, you will play a critical...


  • Toronto, Ontario, Canada Bourse de Montreal Inc. Full time

    Job Title: Site Reliability Engineer. At Bourse de Montreal Inc., we're seeking a highly skilled Site Reliability Engineer to join our Global Technology Services (GTS) team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our technology infrastructure. Key Responsibilities: Evaluate new...


  • Toronto, Ontario, Canada Riverside Natural Foods Full time

    Reliability Engineering Co-op Opportunity. Riverside Natural Foods is seeking a highly motivated and detail-oriented Reliability Engineering Co-op to join our team. As a Reliability Engineering Co-op, you will play a key role in supporting our Asset Management Reliability Program at multiple site locations. Key Responsibilities: Support the implementation of...


  • Toronto, Ontario, Canada KPMG Canada Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at KPMG Canada. As a key member of our Managed Services team, you will be responsible for ensuring the smooth operation of our cloud-based services. Key Responsibilities: Design, implement, and maintain scalable and reliable cloud-based systems. Collaborate with...


  • Toronto, Ontario, Canada SGS Full time

    Job Description: The Site Reliability Engineer will play a critical role in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with MVC, Angular, and Web API. Partner with developers and product operations teams to understand application requirements and translate them into operational practices. Design,...


  • Toronto, Ontario, Canada SGS Full time

    Job Title: Site Reliability Engineer. At SGS, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with MVC, Angular, and Web API. Key Responsibilities: Partner with...


  • Toronto, Ontario, Canada Riverside Natural Foods Full time

    Reliability Engineering Co-op Opportunity. Riverside Natural Foods is seeking a highly motivated and detail-oriented Reliability Engineering Co-op to join our team. As a key member of our Asset Management Reliability Program, you will play a crucial role in ensuring our equipment operates at optimum efficiency and effectiveness. Key Responsibilities: Support the...


Reliability Operations Engineer

3 months ago


Toronto, Ontario, Canada Tecsys Inc. Full time

Embracing the benefits of remote work, such as enhanced employee satisfaction, productivity, and decreased commuting impact on well-being and the environment, we take pride in being a digital-first organization. The technologies and systems we have invested in have laid a solid groundwork for this approach. Our digital-first work culture, complemented by our strategically located offices and collaborative environments, empowers our team with the flexibility to work in a manner that maximizes their productivity.

About Us

Tecsys Inc. is a rapidly expanding innovator providing supply chain solutions to leading healthcare systems, hospitals, and pharmacy businesses, as well as distributors, retailers, and 3PLs. We collaborate with industry frontrunners to revolutionize their supply chains through technology. If you are passionate about addressing intriguing challenges with opportunities for continuous learning, Tecsys could be the right place for you.

About the Role

We are seeking a Reliability Operations Engineer to join our "Network and Security Operations Center" team. Our NOC team focuses on enhancing the reliability and uptime of our platforms and applications through data-driven strategies to meet the needs of both internal and external customers.

Your Responsibilities
  1. Work collaboratively with other engineering teams to support services prior to their launch through activities such as system design consultation, software platform and framework development, capacity planning, and launch reviews.
  2. Ensure services are maintained post-launch by measuring and monitoring availability, latency, and overall system health.
  3. Create tools and automation on Azure and AWS to minimize the need for manual intervention.
  4. Sustainably scale systems through automation and advocate for changes that enhance reliability and efficiency.
  5. Participate in on-call duties.
  6. Engage in sustainable incident response and conduct blameless postmortems.
  7. Implement automated solutions for continuous integration and delivery (CI/CD).
  8. Establish monitoring, logging, alerting, and SLA reporting mechanisms.
  9. Develop service monitoring dashboards showcasing key metrics.
  10. Generate and maintain technical documentation.
  11. Apply Site Reliability Engineering best practices.
  12. Take charge of high-severity incidents and guide their resolution.
  13. Support our planning and deployment teams to ensure stability, predictability, and scalability as we grow.
  14. Collaborate with the Platform Engineering team to implement and support extensive strategic initiatives, provide constructive feedback, and promote a collaborative atmosphere.
  15. Work cross-functionally with internal teams and vendors to manage our global growth, with a strong emphasis on maintaining high performance, availability, and reliability for our users.
Requirements
  1. Bachelor's degree in computer science or a related technical field.
  2. Minimum of 5 years' experience in systems engineering; demonstrable technical expertise in new platform development, orchestration, product ownership, and iterative design and deployment.
  3. Experience in designing and deploying large-scale systems, multi-vendor platforms, and globally distributed infrastructures.
  4. Strong understanding of system design; high-performance computing; file, block, and storage technologies; integration of compute, storage, and network technologies to deliver cohesive infrastructure solutions.
  5. Deep knowledge of, and experience executing projects with, full-stack automation; our scale requires significant automation to reduce manual intervention, using both internal and open-source tools for day-to-day activities.
  6. Ability to self-organize, collaborate, and manage efforts with peers and teams across various responsibility areas, languages, geographies, and time zones.
  7. Be a self-starter, inquisitive, and willing to ask questions and challenge existing processes.
  8. Identify problems or opportunities, take ownership, and act independently.
  9. Familiarity with Datadog (or an equivalent product) preferred.
  10. Familiarity with Rapid7 Insight (or an equivalent product) preferred.
  11. Knowledge and experience with AWS or Azure required.
  12. Basic understanding of Java or .NET-based development required.
  13. Familiarity with GitLab (enterprise license) preferred; at minimum, experience with Jenkins is required.
  14. Experience in a SaaS company is a strong asset.
  15. Strong English communication skills, both written and spoken, are essential for effective correspondence with customers, business partners, and colleagues beyond the province of Quebec.
Additional Requirements
  1. Participation in an on-call escalation rotation.
  2. Occasional travel (quarterly offsites, conferences – less than 10%).

At Tecsys, we are dedicated to cultivating a diverse and inclusive workplace where all employees feel valued, respected, and empowered. We believe that diversity fuels innovation and enhances our ability to deliver exceptional solutions. We welcome and encourage applicants from all backgrounds, experiences, and perspectives to join our team. Tecsys is an equal opportunity employer. Accommodation is available for applicants selected for an interview.

Note: If you are applying for this position, you must be a Canadian Citizen or a Permanent Resident of Canada, OR possess a valid Canadian work permit.