Site Reliability Engineer

3 weeks ago


Vancouver BC, Canada Stafflink Full time

Job Description

Position: Site Reliability Engineer

Duration: 12 Months

Location: Principally remote, with at least one day per month in office for applicants in the Lower Mainland. Local candidates are given preference.

Work hours: Monday – Friday, 9:00 am – 5:00 pm PST

Reference: RITM0091997

Specific Responsibilities and Deliverables:
  1. Serve as the subject matter expert (SME) for Dynatrace, responsible for configuring, optimizing, and managing Dynatrace monitoring solutions.
  2. Design and implement monitoring strategies using Dynatrace to ensure comprehensive visibility into system performance, availability, and reliability.
  3. Collaborate with our Engineering & Platform teams to ensure our services, platforms and infrastructure are emitting the right metrics.
  4. Lead the rollout and adoption of Observability practices, tools, and frameworks across teams and projects.
  5. Collaborate with Incident Management teams to resolve critical incidents, conduct post-incident reviews, and implement preventive measures.
  6. Communicate complex business and technical information clearly and concisely.
  7. Proactively identify and mitigate potential issues, bottlenecks, and performance degradation to ensure system reliability and uptime.
  8. Drive automation initiatives using tools like Ansible, Terraform, or Kubernetes to streamline deployment, configuration, and management of infrastructure.
  9. Conduct capacity planning assessments, analyze resource utilization trends, and forecast capacity requirements to support business growth and scalability.

Mandatory Requirements:
  1. Bachelor's degree in Computer Science, Engineering, or related field; Master's degree preferred.
  2. Extensive and recent experience as a Site Reliability Engineer (SRE) with a focus on Dynatrace and Observability practices.
  3. Strong proficiency in Dynatrace monitoring solutions, including configuration, customization, and optimization.
  4. Hands-on experience with Observability tools and practices such as distributed tracing, logging, metrics collection, and anomaly detection.
  5. Experience with automation tools (Ansible, Terraform, Kubernetes) and Infrastructure as Code (IaC) principles.
  6. Solid understanding of cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
  7. Excellent problem-solving skills, analytical thinking, and the ability to troubleshoot complex technical issues.
  8. Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams and drive initiatives to completion.
  9. Relevant certifications (Dynatrace, AWS, Kubernetes, etc.) are a plus.
  10. Local candidates or candidates willing to attend occasional on-site meetings are preferred.

What's In It For You?

  • Meaningful Impact: Contribute to impactful public sector projects.
  • Career Growth: Advance professionally in a dynamic team environment.
  • Collaboration: Foster relationships within and outside the organization.
  • Professional Development: Gain exposure to best practices.
  • Referral program:

  • Vancouver, BC, Canada Sigmaways Inc Full time

    We're seeking a Site Reliability Engineer to join our team with expertise in Kubernetes and troubleshooting. Responsibilities: Monitor, measure, and report alerts, overall health, performance, and capacity of one or more services. Gain deep knowledge and learn the application stack. Ability to debug and optimize code and automate routine tasks. Function well in a...


  • Vancouver, BC, Canada LayerZero Full time

    LayerZero The Future is Omnichain. Founded in 2021, LayerZero’s vision is to create a community of cross-chain developers, building dApps that are no longer constrained by individual blockchain capabilities. With LayerZero's simple, generic messaging protocol, builders will develop cross-chain dApps designed to unify the power of individual...


  • Vancouver, BC, Canada Dapper Labs Full time

    We’re looking for a Site Reliability Engineer who wants to be at the technical core of an organization that’s completely reshaping how distributed applications on blockchains can reach massive audiences. You will join a Site Reliability Engineering team that has the ability to architect, build, and iterate on resilient, scalable systems. SRE also...


  • Vancouver, BC, Canada RAZR Marketing, Inc. Full time

    Our office is located in Vancouver, BC. Candidates must reside in the Vancouver area and will be required to be in-office three times per week; you may work remotely, if you wish, two days per week. You can't wait to get out of bed in the morning and get on with your day. We are seeking a skilled and motivated Site Reliability Engineer (SRE) to join our...


  • Vancouver, BC, Canada Jotform Full time

    ABOUT JOTFORM Jotform is a San Francisco-based SaaS company with more than 25 million users worldwide. We are thriving and growing, and we’ve never needed outside funding. That’s because we like keeping things agile, independent, and fun. Jotform believes everyone should be able to create their own online forms. Our 10,000+ ready-made form templates, 100+...


  • Vancouver, Canada Dapper Labs Full time

    We’re looking for a Site Reliability Engineer who wants to be at the technical core of an organization that’s completely reshaping how distributed applications on blockchains can reach massive audiences. You will join a Site Reliability Engineering team that has the ability to architect, build, and iterate on resilient, scalable systems. SRE also...


  • Vancouver, Canada LayerZero Full time

    LayerZero The Future is Omnichain. Founded in 2021, LayerZero’s vision is to create a community of cross-chain developers, building dApps that are no longer constrained by individual blockchain capabilities. With LayerZero's simple, generic messaging protocol, builders will develop cross-chain dApps designed to unify the power of individual blockchains. We...


  • Vancouver, British Columbia, Canada Axiom Zen Full time

    We're looking for a Site Reliability Engineer who wants to be at the technical core of an organization that's completely reshaping how distributed applications on blockchains can reach massive audiences. You will join a Site Reliability Engineering team that has the ability to architect, build, and iterate on resilient, scalable systems. SRE also guides the...


  • Vancouver, BC, Canada Jotform Full time

    ABOUT JOTFORM Jotform is a San Francisco-based SaaS company with more than 25 million users worldwide. We are thriving and growing, and we’ve never needed outside funding. That’s because we like keeping things agile, independent, and fun. Jotform believes everyone should be able to create their own online forms. Our 10,000+ ready-made form templates,...


  • Vancouver, BC, Canada Dapper Labs Full time

    We’re looking for a Site Reliability Engineer who wants to be at the technical core of an organization that’s completely reshaping how distributed applications on blockchains can reach massive audiences. You will join a Site Reliability Engineering team that has the ability to architect, build, and iterate on resilient, scalable systems. The support we...


  • Vancouver, BC, Canada Taurus SA Full time

    Senior Site Reliability Engineer [Vancouver] Employee System and Network Administration Are you ready to take on an entrepreneurial challenge in the digital asset industry? Taurus, a global leader in digital asset infrastructure, has an exciting opportunity for you. Founded in April 2018, Taurus provides enterprise-grade solutions to issue, custody, and...


  • Vancouver, Canada Sigmaways Inc Full time

    We're seeking a Site Reliability Engineer to join our team with expertise in Kubernetes and troubleshooting. Responsibilities: Monitor, measure, and report alerts, overall health, performance, and capacity of one or more services. Gain deep knowledge and learn the application stack. Ability to debug and optimize code and automate routine tasks. Function well in a...


  • Vancouver, Canada Dapper Labs Full time

    We’re looking for a Site Reliability Engineer who wants to be at the technical core of an organization that’s completely reshaping how distributed applications on blockchains can reach massive audiences. You will join a Site Reliability Engineering team that has the ability to architect, build, and iterate on resilient, scalable systems. SRE also guides...


  • Vancouver, Canada Targeted Talent Full time

    We are looking for an experienced Senior Site Reliability Engineer for our client. This is a permanent position that is remote to start with later relocation to Calgary or Winnipeg. Our client is a global enterprise company with a product that you've likely used. Experience with coding/software development, along with Site Reliability will be the key to...


  • Vancouver, Canada Sentry Full time

    About the role The Site Reliability Engineering team is responsible for the deployment, configuration, maintenance and monitoring of Sentry's hosted platform. We do this by leveraging automation tools to automatically spin up and scale services to meet the traffic demands of 1,000,000+ developers. Sentry receives over a billion events a day, and processes...


  • Vancouver, BC, Canada Flexton Inc. Full time

    Location: Vancouver, Canada. 5+ years of working experience; extensive/strong AWS experience: experience in designing, deploying, and managing scalable/reliable cloud-based infrastructure; software engineering background/experience: Python, JavaScript, Bash, etc.; in-depth knowledge of infrastructure as code (IaC) tools, like Terraform, GHA, CloudFormation,...


  • Vancouver, Canada TEEMA Full time

    MUST LIVE IN CANADA NEAR AN AIRPORT. Looking for a technical lead with 10+ years of DevOps/SRE experience. MUST HAVE: 5+ years permanent residence or citizenship (can't have lived out of Canada for the last 5 years). Monitoring and logging services are a must 2...


  • Vancouver, BC, Canada Flexton Inc. Full time

    Location: Vancouver, Canada. 5+ years of working experience; extensive/strong AWS experience: experience in designing, deploying, and managing scalable/reliable cloud-based infrastructure; software engineering background/experience: Python, JavaScript, Bash, etc.; in-depth knowledge of infrastructure as code (IaC) tools, like Terraform, GHA,...