Platform Observability

3 hours ago


Canada Elastic Full time

Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale, unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data, and securing and protecting private information more effectively, Elastic’s complete, cloud‑based solutions for search, security, and observability help organizations deliver on the promise of AI.

What Is The Role

The Platform Observability Team provides critical, scalable, and efficient observability processes and tooling for Elastic's internal developers and engineering teams. We operate 200+ deployments for logs, metrics, and traces, including regional logging, metrics, and monitoring clusters. We build self‑service tooling that empowers engineers to instrument, monitor, and troubleshoot their services independently. We're part of Platform Infrastructure, the organization responsible for providing the runtime for all Elastic Cloud‑powered products. We are builders at heart.

We need a Senior Engineering Manager who puts people first but still loves diving into the technical details of observability and distributed systems. You will lead a globally distributed team of 7 SREs across EMEA, Americas, and APJ, ensuring our internal customers have the tools they need to build reliable services at scale. You’ll own SLA monitoring infrastructure for Elastic Cloud (ESS and Serverless) and drive adoption of Elastic’s own observability stack across the organization.

We value leaders who embrace our SRE culture: go slow to go fast, own problems end‑to‑end, make sound and timely decisions, and create amazing experiences for both internal and external customers.

What You Will Be Doing

People & Talent Management: mentor and lead a globally distributed team of SREs, fostering a culture of ownership, psychological safety, and continuous improvement. Manage the full employee lifecycle, from hiring top talent to helping team members reach promotion and creating clear career paths.
Strategy & Execution: transition observability needs into platform capabilities by partnering with engineering teams and Platform SRE leadership to understand requirements, build roadmaps, and translate them into clear deliverables. Drive adoption of observability best practices and ensure our platforms meet the needs of internal customers.
Technical & Operational Leadership: hold full accountability for our platforms. Partner with your team to facilitate technical discussions, navigate trade‑offs, and drive delivery of high‑quality observability solutions. Champion reliability improvements, incident management processes, and blameless post‑mortems, keeping our platforms production‑ready.

What You Bring

Management Experience: 3+ years leading technical teams, with a focus on mentoring and talent development.
Technical Foundation: 5+ years in SRE, DevOps, or infrastructure engineering, with enough depth to understand the work and guide senior engineers through complex problems.
Observability Expertise: strong understanding of observability principles: metrics, logs, traces, and APM. Experience defining and tracking SLIs, SLOs, and error budgets.
SaaS at Scale: previous success supporting high‑scale, multi‑tenant global platforms.
Distributed Systems Background: experience operating and scaling systems in cloud environments (AWS, GCP, or Azure) and improving reliability.
Distributed Leadership: experience managing geographically distributed teams across multiple time zones and cultures.
Operational Rigor: experience with incident management, on‑call processes, and post‑mortem practices.
Infrastructure as Code: familiarity with Kubernetes, Terraform, and GitOps practices.
Communication: a knack for translating technical strategy and progress for all audiences, technical or not. Strong written communication is important.

Bonus Points

Elastic Stack Expertise: experience with Elastic Observability, Elasticsearch, Kibana, Beats, and/or APM.
Platform Engineering Background: experience building internal developer platforms or self‑service tooling.
You enjoy working with a distributed company and the active, asynchronous communication it requires.
You love a diverse environment, working with people all over the world. You believe a diverse company is a better company.
You are willing to listen and give everyone at the table a voice.

Additional Information – We Take Care of Our People

As a distributed company, diversity drives our identity. Whether you’re looking to launch a new career or grow an existing one, Elastic is the type of company where you can balance great work with great life. Your age is only a number. It doesn’t matter if you’re just out of college or your children are; we need you for what you can do.

We strive to have parity of benefits across regions, and while regulations differ from place to place, we believe taking care of our people is the right thing to do.

Competitive pay based on the work you do here and not your previous salary.
Health coverage for you and your family in many locations.
Ability to craft your calendar with flexible locations and schedules for many roles.
Generous number of vacation days each year.
Increase your impact – We match up to $2000 (or local currency equivalent) for financial donations and service.
Up to 40 hours each year to use toward volunteer projects you love.
Embracing parenthood with a minimum of 16 weeks of parental leave.

Different people approach problems differently. We need that.

Elastic is an equal opportunity employer and is committed to creating an inclusive culture that celebrates different perspectives, experiences, and backgrounds. Qualified applicants will receive consideration for employment without regard to race, ethnicity, color, religion, sex, pregnancy, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, disability status, or any other basis protected by federal, state or local law, ordinance or regulation.

We welcome individuals with disabilities and strive to create an accessible and inclusive experience for all individuals. To request an accommodation during the application or the recruiting process, please email We will reply to your request within 24 business hours of submission.

Applicants have rights under Federal Employment Laws and can view the following posters linked below: Family and Medical Leave Act (FMLA) Poster; Equal Employment Opportunity (EEO) Poster; and Employee Polygraph Protection Act (EPPA) Poster.

Please see here for our Privacy Statement.



  • Canada Elastic Full time

    A leading tech company in Canada is seeking a Senior Engineering Manager to lead their Platform Observability Team. You will manage a globally distributed team and oversee SLA monitoring infrastructure for Elastic Cloud. The ideal candidate will have 5+ years in SRE or DevOps, strong experience with observability principles, and a passion for mentoring. This...


  • Canada ClickHouse Full time

    Cloud Software Engineer - Observability Platform. About ClickHouse: Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fast‑growing private cloud companies. With over 2,000 customers and an ARR that has more than quadrupled over the past...


  • Canada PowerToFly Full time

    Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale, unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the...


  • Canada ClickHouse Full time

    A leading data analytics company is seeking a Cloud Software Engineer for the Observability Platform. The role involves designing, building, and maintaining systems for internal monitoring and customer observability. Candidates should have over 5 years of experience, strong skills in Golang, and familiarity with Kubernetes. Flexible work options are...


  • Canada Confluent Full time

    A leading data streaming platform company is seeking a remote Senior Software Engineer II specializing in observability. This role involves designing and maintaining critical observability infrastructure, ensuring high reliability, and collaborating with engineering teams. Candidates should have over 5 years of experience in distributed systems and...


  • Canada PowerToFly Full time

    A leading technology company seeks a Senior Engineering Manager to lead their globally distributed SRE team. The role involves mentoring engineers and turning observability needs into deliverables. Candidates should have substantial leadership experience in technical environments, along with a strong understanding of observability practices and...


  • Canada Confluent Full time

    Senior Software Engineer II - Observability (Remote - Canada). We’re not just building better tech. We’re rewriting how data moves and what the world can do with it. With Confluent, data doesn’t sit still. Our platform puts information in motion,...

  • Platform Engineer

    5 days ago


    Canada Deck Software Full time

    About Deck: Deck is building the data infrastructure for the internet. We make scattered, login‑protected data instantly accessible through clean APIs and integrations, empowering businesses to act fast and smart, with no friction. We’re a team of builders from top‑tier tech companies who believe one thing: great ideas need great data. If you thrive...

  • Platform Engineer

    4 days ago


    Canada Deck Full time

    About Deck: Deck is building the data infrastructure for the internet. We make scattered, login‑protected data instantly accessible through clean APIs and integrations, empowering businesses to act fast and smart, with no friction. We’re a team of builders from top‑tier tech companies who believe one...

  • Platform Engineer

    1 day ago


    Canada Deck Software Full time

    About Deck: Deck is building the data infrastructure for the internet. We make scattered, login-protected data instantly accessible through clean APIs and integrations, empowering businesses to act fast and smart, with no friction. We're a team of builders from top-tier tech companies who believe one thing: great ideas need great data. If you thrive in...