Infrastructure Reliability Engineer

4 weeks ago


Fort Albany, Canada Braze Full time

About Braze

At Braze, we pride ourselves on our exceptional team, characterized by approachability, kindness, and a shared passion for excellence.

We aim to cultivate this enthusiasm by establishing high standards, promoting collaboration, and fostering a harmonious work-life balance as we navigate our rapid global expansion while striving for equity and opportunity both within and outside our organization.

To thrive in our environment, you must be ready to set ambitious goals for yourself and your colleagues. There is always an opportunity to contribute: exercising autonomy, embracing accountability, and welcoming diverse perspectives are vital to our ongoing success. Our insatiable curiosity and eagerness to share our varied interests enrich our culture and create a vibrant workplace.

If you are motivated to tackle exciting challenges and have a proactive approach to change, you will have the opportunity to make a significant impact here, supported by a dedicated and passionate team. If Braze resonates with your values, we look forward to connecting with you.

Role Overview

Site Reliability Engineers (SREs) play a crucial role in ensuring the seamless operation of all internal services and platforms. In essence, SREs are tasked with maintaining site availability. They combine the skills of adept system administrators and software engineers, applying robust engineering principles, operational discipline, and advanced automation to the environments and infrastructure services we provide. Our expertise spans many kinds of systems, from networking and the Linux kernel to scaling algorithms and distributed systems.

Our team is dedicated to enhancing automation and infrastructure reliability, and to empowering Braze's engineering teams to make effective use of the infrastructure products and platforms we develop. Operating at a significant scale, we support over 3.3 billion monthly active users across our clients, processing hundreds of billions of data points each month and delivering billions of messages to end-users daily. Our diverse technology stack includes Ruby on Rails, MongoDB, Redis, Kafka, Kubernetes, and more. As a Site Reliability Engineer at Braze, you will collaborate with your team and consumer engineering teams to continuously enhance the infrastructure, automation, and tools that support internal products built on these technologies.

Key Responsibilities

  • Collaborate with Braze's engineering teams to:
      • Design products that effectively leverage infrastructure platforms in a scalable and reliable manner.
      • Diagnose reliability and scalability challenges across all layers of the stack, including products developed using our infrastructure platforms.
      • Create monitoring and alerting systems that focus on symptoms rather than outages.
      • Ensure compliance with our stringent enterprise-grade SLAs for customers.
  • Advance Braze's internal platform infrastructure by:
      • Implementing Infrastructure as Code using Chef, Terraform, and Kubernetes.
      • Developing deployment pipelines for applications across multiple languages using Docker, Kubernetes, etc.
      • Providing centralized tools, services, and automation frameworks that are essential for scaling operations, managing capacity, alleviating operational challenges, and enhancing the daily workflow of Braze's engineering teams.
  • Manage incidents by:
      • Participating in a PagerDuty rotation to address availability incidents and support fellow engineers.
      • Utilizing your on-call shifts to prevent incidents proactively.
      • Conducting retrospectives on incidents to transform lessons learned into system enhancements, automation, etc.

Candidate Profile

  • 3+ years of experience as a Software, DevOps, or Site Reliability Engineer.
  • Possess a systems-oriented mindset, considering interfaces, boundaries, edge cases, failure modes, behaviors, and specific implementations.
  • Exhibit a collaborative spirit, with a focus on documentation and swift delivery:
      • Work effectively with global remote teams, often in asynchronous settings.
      • Document processes to avoid redundancy in learning and planning.
      • Deliver promptly to exceed customer expectations, including internal stakeholders.
  • Demonstrate a proactive attitude; when encountering issues, you are driven to resolve them.
  • Possess a desire to address everyday challenges faced by software engineers and automate repetitive tasks.
  • Exhibit strong multitasking abilities and manage various expectations simultaneously.
  • Be proficient in Linux and the Unix shell.
  • Have strong programming skills, preferably in Ruby and/or Go.
  • Experience with Docker, Kubernetes, Terraform, or similar Infrastructure as Code technologies.
  • Familiarity with MongoDB, Redis, Kafka, Postgres, or similar data technologies.

What We Offer

Details regarding our benefits package will be shared upon receiving an employment offer. Benefits may vary by location.

We provide comprehensive benefits and foster flexible work environments, ensuring you can prioritize work-life harmony.

  • Competitive compensation, which may include equity.
  • Retirement and Employee Stock Purchase Plans.
  • Flexible paid time off.
  • Comprehensive benefits covering medical, dental, vision, life, and disability.
  • Family services, including fertility benefits and equal paid parental leave.
  • Professional development supported by structured career paths, learning platforms, and tuition reimbursement.
  • Opportunities for community engagement throughout the year, including an annual company-wide Volunteer Week.
  • Employee Resource Groups that foster supportive communities within Braze.
  • A collaborative, transparent, and enjoyable culture recognized as a Great Place to Work.

Equal Opportunity Employer

At Braze, we are committed to creating equitable growth and opportunities both within and outside the organization.

Building meaningful connections is central to our mission, including our recruitment practices. We strive to provide all candidates with a fair, accessible, and inclusive experience, regardless of age, color, disability, gender identity, marital status, maternity, national origin, pregnancy, race, religion, sex, sexual orientation, or status as a protected veteran. We encourage you to showcase your unique qualities during the application and interview process.

We understand that various circumstances may lead talented individuals to hesitate in applying for a role unless they meet all criteria. If this resonates with you, we encourage you to apply, as we would love to meet you.

For more information on how Braze processes your personal information during the recruitment process and your privacy rights, please refer to our privacy policy.


