Staff Infrastructure Engineer, Observability Solutions Specialist

2 weeks ago


Toronto, Ontario, Canada Lyft Full time

At Lyft, our mission is to revolutionize transportation with innovative solutions. To achieve this, we rely on our Infrastructure team to build scalable software that solves complex problems. As an Observability team member, you will play a crucial role in ensuring the operational health of our logging and metrics infrastructure. You will monitor system availability, take a holistic view of our platform performance, and build software to automate infrastructure platform operations and management. By measuring and monitoring our operations, you will identify opportunities to improve our systems and push our platform forward.

Responsibilities:

  • Provide technical mentorship within the team and lead by example in developing robust, scalable, and efficient observability solutions.
  • Drive cross-functional collaboration with engineering teams to advance Lyft's observability capabilities, ensuring they align with business objectives and developers' requirements.
  • Play a pivotal role in steering the observability roadmap and strategic direction, utilizing a comprehensive understanding of business context to influence key decisions and initiatives.
  • Design, develop, and deploy advanced tooling and systems that enhance the reliability, scalability, and efficiency of our platform.
  • Operate and improve our infrastructure using industry best practices and tools, setting standards for excellence.
  • Document infrastructure operations processes and insights, identify repeatable actions, and lead the automation of repetitive tasks.
  • Participate in our team's on-call rotations, respond to incidents, and provide expert support to other teams in mitigating customer-impacting events.

Experience:

  • 8+ years of experience in roles focused on software development, automation, and systems engineering, with a proven track record of technical leadership.
  • Bachelor's Degree or equivalent experience in Computer Science or a relevant discipline, with a strong foundation in observability principles.
  • Proven expertise in architecting and scaling observability infrastructure to support comprehensive monitoring and analysis in large production environments.
  • Advanced proficiency in writing production-ready code in high-level languages such as Go or Python.
  • Extensive experience operating large-scale infrastructure in public cloud environments such as AWS, and with managed services like Amazon OpenSearch Service and Amazon Managed Service for Prometheus.
  • Deep experience with Kubernetes and Envoy Proxy, managing multi-cluster environments in large-scale production settings.
  • Familiarity with distributed storage technologies such as S3, RDS, DynamoDB, and Aurora, and with distributed configuration systems such as ZooKeeper and etcd.
  • Expertise in deploying and managing monitoring, alerting, and logging systems at massive scale, such as Prometheus, Grafana, Kibana, Telegraf, and M3.

Benefits:

  • Extended health and dental coverage options, along with life insurance and disability benefits.
  • Mental health benefits.
  • Family building benefits.
  • Access to a Health Care Savings Account.
  • In addition to provincially observed holidays, team members get 15 days of paid time off, with an additional day for each year of service.
  • 4 Floating Holidays each calendar year, prorated based on date of hire.
  • 10 paid sick days per year regardless of province.
  • 18 weeks of paid parental leave. Biological, adoptive, and foster parents are all eligible.

Lyft is an equal opportunity employer and welcomes applications from diverse candidates. We strive for a healthy and safe workplace and strictly prohibit harassment of any kind. Accommodation for persons with disabilities will be provided upon request in accordance with applicable law during the application and hiring process. Please contact your recruiter now if you wish to make such a request.

This role will be in-office on a hybrid schedule: team members will be expected to work in the office 3 days per week, on Mondays, Thursdays, and a team-specific third day. Additionally, hybrid roles have the flexibility to work from anywhere for up to 4 weeks per year.


