Current jobs related to Lead Site Reliability Engineer - Canada - CoreTek Labs

  • Lead Engineer

    1 month ago


    Canada Replicant Full time

    Engineering Lead - Site Reliability Operations. Oversee the Site Reliability Operations team in a fully remote setting. Enhance technical and product excellence with an emphasis on outstanding customer satisfaction. Collaborate with Engineering Executives to develop a resilient cloud infrastructure. Encourage a "shift left" mindset for platform dependability,...


  • Canada Remote Sensing Full time $165,000 - $190,000

    About Us: Remote Sensing is at the forefront of defining observability and enhancing expectations of developer tools. As we continue to grow, we are proud to have achieved significant milestones, including recent funding and recognition as one of America's Best Startups. Our Culture: We embrace a remote-first approach, valuing the impact of your contributions...


  • Canada Shopify Full time

    About the Role: We're seeking a skilled Site Reliability Engineer to join our team at Shopify. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and performance of our planet-scale systems. Our engineering culture is deeply collaborative, truth-seeking, and merchant-obsessed. We believe that a great engineer can apply and...


  • Canada Shopify Full time

    Shopify is seeking a skilled Site Reliability Engineer to join its team, responsible for ensuring the resilience and performance of its planet-scale systems. About the Role: We're looking for experienced and curious software engineers to help build the future of commerce. Our engineering culture is deeply collaborative, truth-seeking, candid, and...


  • Canada Granicus, Inc. Full time

    About Granicus, Inc.: Granicus, Inc. is a leading provider of citizen engagement technologies and services for the public sector, dedicated to bringing governments closer to the people they serve with innovative solutions. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Granicus, Inc. As a Site Reliability Engineer, you...


  • Canada Flashpoint Full time

    Position Overview: Flashpoint is seeking a remote Lead Site Reliability Engineer I. The compensation range is approximately CA$145,000/yr - CA$165,000/yr, along with performance bonuses. About Us: At Flashpoint, we value collaboration and selflessness. Our team members thrive on working with infrastructure and reliability initiatives while prioritizing the...


  • Canada Operant AI, Inc. Full time

    About Operant AI, Inc.: Operant AI, Inc. is a leading provider of cloud-native security solutions. We are passionate about bringing state-of-the-art technological innovations from Operating Systems/Distributed Systems/AI to the world of cloud-native security. Job Summary: We are seeking a highly skilled Staff Site Reliability Engineer to join our team. As our...


  • Canada I Can Infotech Full time

    Job Summary: We are seeking a highly skilled and motivated Junior Site Reliability Engineer to join our team at I Can Infotech. As a key member of our infrastructure team, you will be responsible for designing, implementing, and maintaining scalable and reliable systems that meet the needs of our business. Key Responsibilities: Infrastructure Management: Assist in...


  • Canada Operant AI, Inc. Full time

    Senior Site Reliability Engineer - Remote in US/Canada As the inaugural SRE member at Operant AI, Inc., you will play a pivotal role in shaping our SRE strategy and establishing the necessary frameworks to ensure our platforms and services remain robust and secure. If you are passionate about being a foundational engineer in a startup that is set to...


  • Canada Bluebayinvest Full time

    Job Summary: We are seeking a skilled Site Reliability Engineer to join our team at Bluebayinvest. As a Site Reliability Engineer, you will play a critical role in ensuring the stability, scalability, and performance of our systems and applications. Key Responsibilities: Develop, support, and maintain our Splunk Enterprise log analytics platform to ensure...


  • Canada Float Full time

    About Float: Float is recognized as the premier software solution for teams to effectively manage their time. As a certified B Corporation, we are dedicated to making a meaningful impact on our team, clients, the environment, and the remote community. Our team consists of 50 professionals working entirely remotely, committed to achieving our Best Work Life. As...


  • Canada Lead Discovery GmbH Full time

    Job Description. Company Overview: At Lead Discovery GmbH, we are a leading developer data platform, transforming industries and empowering developers to build amazing applications that people use every day. Job Summary: We are seeking a highly skilled Senior Software Engineer to join our Server Programmability (SP) Team. As a key member of this team, you will be...


  • Canada Okta, Inc. Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Customer Identity team at Okta, Inc. As a Staff Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our Customer Identity Cloud product. Key Responsibilities: Collaborate with engineering teams to improve the availability,...

Lead Site Reliability Engineer

3 months ago


Canada CoreTek Labs Full time

Position: SRE Lead/Architect

Location: Canada (Remote)

Experience: 10+ years

Contract to Hire


Job Description:

  • 10+ years of total work experience in Performance Engineering, DevOps, and Site Reliability Engineering.
  • Fluent in English with excellent stakeholder communication skills.
  • Candidate must have been in the country for 2+ years.
  • AWS certification is mandatory.
  • Strong analytical and problem-solving skills.
  • Ability to thrive in a fast-paced environment and adapt to changing priorities.


Responsibilities:

  • Lead and mentor a team of SREs to ensure operational excellence and maximize the reliability and availability of client systems.
  • Architect and design highly scalable and available infrastructure solutions, integrating best practices in reliability engineering and automation.
  • Collaborate with cross-functional teams (DevOps, Development, IT) to implement SRE principles throughout the software development life cycle.
  • Establish and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for critical services, monitoring and maintaining performance against defined targets.
  • Implement and enhance observability, alerting, and incident response processes to proactively address issues and minimize downtime.
  • Drive continuous improvement initiatives, identifying bottlenecks and optimizing performance across the infrastructure and application stack.
  • Develop and maintain documentation related to system architecture, configuration, and procedures.
  • Stay current with industry trends, recommending and adopting new tools and practices to enhance system reliability.


Qualifications:

  • Strong background in designing and implementing highly available and scalable infrastructure.
  • Proficiency in scripting and automation using Python or Shell.
  • Experience with container orchestration platforms, serverless architectures, CI/CD pipelines, and IaC implementations (Ansible and Terraform).
  • Experience with Observability tools (preferred: Datadog, CloudWatch).
  • In-depth knowledge of cloud computing platforms (preferred: AWS).
  • Solid understanding of SRE/DevOps principles and practices.
  • Excellent problem-solving skills with the ability to troubleshoot complex issues in production environments.
  • Strong communication and leadership skills, fostering effective collaboration with cross-functional teams.
  • Relevant certifications in SRE, DevOps, Cloud, etc., are a plus.
  • Proficiency in coding with a strong understanding of software design.
  • Ability to conduct performance engineering and implement improvements across design, code, and operations in both application and infrastructure tiers.
  • Mandatory experience with AWS services, backed by AWS certification. Hands-on design and development experience with AWS Lambda.
  • Experience in frontend technologies such as Angular, JavaScript, and HTML.
  • Experience in GraphQL, NodeJS, and ExpressJS.
  • Experience in MongoDB and SQL RDS.
  • Experience in owning and maintaining microservices architecture.
  • Awareness of mobile applications and the iOS platform.