Site Reliability Engineer

1 month ago


Toronto, Ontario, Canada Braze Full time

Site Reliability Engineers (SREs) are responsible for keeping all internal-facing services and platforms running smoothly. In a nutshell, SREs ensure site uptime. SREs are a blend of sensible system administrators and software engineers who apply sound engineering principles, operational discipline, and mature automation to the environments and infrastructure services we provide. We specialize in systems, whether that means networking, the Linux kernel, or a more specific interest in scaling, algorithms, or distributed systems.

Our team improves automation and infrastructure reliability, and empowers Braze's other engineering teams to easily leverage the infrastructure products and platforms we create. Braze operates at a massive scale with over 3.3 billion monthly active users across our customers, collecting hundreds of billions of data points each month, and sending billions of messages to end-users daily. We use a diverse technology stack rooted in Ruby on Rails, MongoDB, Redis, Kafka, Kubernetes, and more. As a Site Reliability Engineer at Braze, you will collaborate with your team and consumer engineering teams to continuously improve the infrastructure, automation, and tooling that build internal products from these technologies.

WHAT YOU'LL DO

Partner with Braze's engineering teams on:

Architecting products to effectively utilize infrastructure platforms in a scalable, reliable manner
Debugging reliability and scalability issues across all stack layers, including the products built using our infrastructure platforms
Making monitoring and alerting fire on symptoms, not on outages
Ensuring that Braze meets our strict enterprise-grade SLAs with customers

Develop Braze's internal platform infrastructure:

Create infrastructure as code using Chef, Terraform, and Kubernetes
Develop deployment pipelines for applications in multiple languages using Docker, Kubernetes, etc.
Provide centralized/common tooling, services, and automation frameworks that are critical for scaling operations, capacity management, reducing operational pain, and improving the day-to-day workflow of Braze's engineering teams

Manage incidents:

Be on a PagerDuty rotation to respond to availability incidents and provide support for other engineers
Use your on-call shift to prevent incidents from ever happening
Run retrospectives on everything that happens to turn lessons into system improvements/changes, automation, etc.

WHO YOU ARE

3+ years of experience as a Software, DevOps, or Site Reliability Engineer
Think about systems: interfaces, boundaries, edge cases, failure modes, behaviors, specific implementations
Have an urge to collaborate, document, and deliver quickly

Collaborate across global remote teams, often working asynchronously
Document everything so you don't need to learn the same thing (or plan the same work) twice
Deliver fast to delight our customers, even internal ones

Have an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it
Have a desire to solve everyday challenges facing software engineers and automate their toil away
Have an excellent ability to manage multiple tasks and expectations at once
Know your way around Linux and the Unix shell
Have strong programming skills (Ruby and/or Go preferred)
Have experience with Docker, Kubernetes, Terraform, or similar IaC technologies
Have experience with MongoDB, Redis, Kafka, Postgres, or similar data technologies


  • Toronto, Ontario, Canada Bold Commerce Full time

    Salary: Who is Bold Commerce?Bold Commerce powers personalized checkout experiences for leading omnichannel retailers and direct-to-consumer brands.As a leader in the composable commerce space, Bold makes checkout better, boosting profitability by enabling personalized, customer-specific checkout flows designed to increase the Checkout Power Trio of...


  • Toronto, Ontario, Canada Vaco Full time

    Job PostingAbout the CompanyOur client operates global markets and builds digital communities and analytic solutions and is looking to hire a Site Reliability EngineerAbout the OpportunityStephen manages the infra group team, Windows, virtualization, IT infrastructure, etc. Works closely with Jeremy who is the hiring manager away for Pat leave. They are...


  • Toronto, Ontario, Canada Red Hat Full time

    About the JobRed Hat is seeking a Senior Site Reliability Engineer (SRE) to develop, scale, and operate our OpenShift managed cloud services. OpenShift is Red Hat's enterprise Kubernetes distribution. As an SRE you will contribute to running OpenShift at scale by enabling customer self-service, making our monitoring system more sustainable, and eliminating...


  • Toronto, Ontario, Canada Vale Base Metals Full time

    Job Title: Sr. Engineer ReliabilityWant to work with leading technology? Who We Are: Welcome to Vale. Our purpose is to improve life and transform the future. Together. We value our workforce and strive to offer continuous training and career development opportunities for our people.Vale Base Metals is one of the world's largest producers of high-quality...


  • Toronto, Ontario, Canada Forhyre Full time

    We are looking for someone that is generalist at heart, one who is curious, appreciates complexity, knows or wants to learn when to step back and when to dive deep. We call this role a Cloud Service Reliability Engineer. The Cloud Service Reliability Engineer will be responsible for effective design, execution, and maintenance of systems implemented on...

  • Engineer

    1 month ago


    Toronto, Ontario, Canada Toronto Hydro Full time

    Position: Electrical Engineer at Toronto HydroReporting to the Manager, Engineering, the Engineer provides engineering support services ensuring technical soundness, reliability, safety, and cost-effectiveness of the Electrical Distribution Power System. The Engineer is legally responsible for personal work and that of others. In addition to Substation...

  • Engineer

    2 months ago


    Toronto, Ontario, Canada Toronto Hydro Corporation Full time

    As an Engineer in Grid Operations, you provide a wide range of engineering support services that will ensure technical soundness, reliability, safety and cost effectiveness of the utility and its Electrical Distribution Power System. The Engineer develops specifications; participates in short and long range contingency planning; performs operational design...

  • Site Superintendent

    2 weeks ago


    Toronto, Ontario, Ontario, Canada Webuild Full time

    About Us:Webuild is an international construction company of civil engineering pioneers who have been at the forefront of the construction business for 120 years. We are a global player with Italian roots specializing in complex infrastructure: innovative and sustainable works that improve the lives of people. In over a century, we built some of the...

  • Site Manager

    1 month ago


    Toronto, Ontario, Canada GE Renewable Energy Full time

    Job Description SummaryThe Site Manager, who reports to the construction manager, will be responsible for the on-site execution of hydraulic turbine and alternator projects. Responsibilities include, but are not limited to, EHS planning, execution strategy, scheduling, cost tracking and site resource planning. In this role, you will work within defined...


  • Toronto, Ontario, Canada Flynn Canada Ltd Full time

    Industrial Manufacturing EngineerFlynn Manufacturing DivisionToronto, ONFlynn Manufacturing is seeking a highly skilled and experienced Industrial Manufacturing Engineer to join our dynamic team. The successful candidate will be responsible for optimizing our manufacturing processes, improving productivity, and ensuring the efficient use of resources. This...

  • Engineer

    1 month ago


    Toronto, Ontario, Canada Toronto Hydro Full time

    WORK ILLUSTRATION: Reporting to the Manager, Engineering, the Engineer provides a wide range of engineering support services that will ensure technical soundness, reliability, safety and cost effectiveness of the utility and its Electrical Distribution Power System. The Engineer is accountable and legally responsible for personal drawings, calculations,...


  • Toronto, Ontario, Canada Microsoft Canada Full time

    OverviewMicrosoft Cloud Operations and Innovation (CO&I) is the team behind the cloud. Within CO&I, the Datacenter Engineering (DCE) team is responsible for delivering core datacenter infrastructure for Microsoft's cloud business. The Microsoft portfolio consists of complex, multi-disciplinary, large scale, multi-year datacenter construction projects. We are...


  • Toronto, Ontario, Canada J.S. Held Full time

    Salary: The CompanyAre you looking to join an organization that is growing and dynamic? What about a high-energy, collaborative environment that rewards hard work?J.S. Held is a global consulting firm that combines technical, scientific, financial, and strategic expertise to advise clients seeking to realize value and mitigate risk. Our professionals serve...


  • Toronto, Ontario, Canada Design Works Engineering Full time

    Salary: Senior Electrical Engineer Toronto, ON Hello and welcome to Design Works Engineering We are a multi-discipline engineering firm inclusive of civil engineering, structural engineering, mechanical engineering, electrical engineering, energy modelling, and fire protection design. Our diverse staff all share the same vision – create great projects,...


  • Toronto, Ontario, Canada GE Renewable Energy Full time

    Job Description SummaryGE Renewable Energy's portfolio of solutions for hydropower generation includes the broadest range of hydro solutions and services: from water to wire, from individual equipment to complete turnkey solutions, for new plants and the installed base. At GE, we believe that the combination of our extensive hydro and digital intelligence...

  • Site Supervisor

    4 days ago


    Greater Toronto Area, Canada, Ontario The Mirillion Group Full time

    My client is seeking a full-time Construction Site Supervisor for office interior fit out/renovation projects. The site supervisor is responsible for overseeing and managing all aspects of a construction site. This will me managing multiple projects within the Greater Toronto Area, with a strong responsibility to lead teams and mentor junior staff, whilst...


  • Toronto, Ontario, Canada PROTEINQURE INC. Full time

    Senior Software EngineerAt ProteinQure, we are building a computational platform for the design of peptide therapeutics. By daring to deliver therapeutics in a novel way we are changing the game for drug development and bringing hope to patients with previously untreatable diseases. We work on treatments for cancer, diabetes, neurodegenerative, and...

  • Engineer

    4 days ago


    Toronto, Ontario, Canada Toronto Hydro Corporation Full time

    As an Engineer at Toronto Hydro, you will play a pivotal role in developing and implementing innovative solutions across various aspects of utility operations. Collaborating with cross-functional teams, you will have the opportunity to contribute to the design, construction, and maintenance of critical infrastructure and systems. You'll work on projects...

  • Structural Engineer

    2 months ago


    Toronto, Ontario, Canada Englobe Corp. Full time

    Your EmployerDare to join Englobe At nearly 3,000 people, Englobe is one of Canada's premier firms specializing in professional engineering services, environmental sciences, and soil and biomass treatment. With offices located across Canada, the United Kingdom and France, we are conveniently located to support large- and small-scale projects, through...

  • TBM Engineer

    2 months ago


    Toronto, Ontario, Canada Ontario Transit Group Full time

    Company DescriptionFerrovial Construction Canada Inc. and VINCI Construction Grands Projects are undertaking the design, build, and finance the Ontario Line Southern Civil, Stations, and Tunnel (South Civil) package.As Ontario Transit Group, we are now mobilizing our design and construction crews, with major works. The South Civil contract is anticipated to...