Reliability Engineering Specialist

3 weeks ago


Old Toronto, Canada Robinhood Full time
About the Role

We're seeking a skilled Software Developer to join our Reliability Engineering team at Robinhood. As part of this team, you'll play a crucial role in designing, evolving, and maintaining large-scale distributed systems.

The team is focused on building robust, scalable systems that ensure high availability and low latency. Our primary areas of focus include developing a company-wide software system for tracking outages/SEVs and monitoring critical workflows for the business.

In this role, you'll combine your software and systems knowledge to engineer distributed systems that are reliable, scalable, and fault-tolerant for Robinhood. You'll work closely with other infrastructure teams to achieve this goal.

Our technology stack primarily consists of Python/Go and container orchestration technologies such as Kubernetes. We also utilize microservice-oriented architectures and related OSS technologies like Kafka, Celery/RabbitMQ, nginx, Redis, Postgres, Airflow, and Consul. Our systems are built within AWS.

Responsibilities
  • Design and implement new features and services with a focus on high availability, low latency, and scalability.
  • Continually optimize systems and workflows by improving architecture, infrastructure, automation, CI/CD, and observability.
  • Act as an owner and leader of Robinhood's infrastructure by ensuring project infrastructure needs are met and working proactively with customer teams to help them improve reliability.
Requirements
  • Fluent in one or more programming languages (e.g., Go, Python, Java).
  • Experience authoring and operating high-scale services.
  • Experience with scalable distributed systems, either built from scratch or on public cloud primitives.
  • Plus points if you have experience with Python/Django/Go and AWS.

We're committed to providing an inclusive and welcoming interview experience for all candidates. If you need additional assistance throughout the process due to a physical or mental condition, or if there's something our team can do to enable a more accessible experience, please notify us in advance.



  • Toronto, Ontario, Canada SGS Full time

    **Job Title:** Reliability Engineering Specialist. At SGS, we are seeking a skilled Reliability Engineering Specialist to join our team. This role plays a critical part in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with MVC, Angular, and Web API. As a key member of our team, you will partner with...


  • Old Toronto, Canada Lorien Full time

    Hybrid - Manchester. We are currently working with a leading gambling company dedicated to providing exceptional gaming experiences. They are looking for an experienced Site Reliability Engineer with a strong skill set in system reliability to join its world-class technology team. This role is ideal for someone who has 4+ years of experience within the...


  • Toronto, Ontario, Canada The Engineering Institute of Canada Full time

    Job Summary: As a Senior Technical Specialist, Equipment Reliability, you will play a key role in developing and maintaining a deep technical understanding of our insured's businesses to enable world-class insurance engineering services. Your expertise in rotating equipment, specifically prime movers for power generation, will be highly...


  • Old Toronto, Canada TD Bank Full time

    Site Reliability Engineer. Work Location: Canada. Hours: 37.5. Line of Business: Technology Solutions. Pay Details: We’re committed to providing fair and equitable compensation to all our colleagues. As a candidate, we encourage you to have an open dialogue with a member of


  • Old Toronto, Canada Chelsea Avondale Full time

    Chelsea Avondale is the world’s most cutting-edge home insurance group. We have developed sophisticated risk modeling and insurance pricing technologies for home insurance and deploy that technology through our own insurance company. Our team consists of some of the brightest minds in insurance, software development, finance, and operations. Our group...


  • Toronto, Ontario, Canada Criteo Full time

    About the Role: This is a challenging opportunity for an experienced engineer to join Criteo's PRE team as a Site Reliability Engineer. The role involves working closely with product engineering to improve the reliability of our apps, systems, and pipelines, assessing where optimization is needed most, and telling stories with meaningful monitoring. Key...


  • Old Toronto, Canada TD Bank Full time

    Site Reliability Engineer. Work Location: Canada. Hours: 37.5. Line of Business: Technology Solutions. Pay Details: We’re committed to providing fair and equitable compensation to all our colleagues. Job Description: CUSTOMER: Provide technical leadership to improve the design and operation of systems in alignment to reliability...


  • Old Toronto, Canada Street Context Full time

    Are you a Site Reliability Engineer who has a passion for building reliable, resilient and performant systems that scale? Do you command with a steady hand when incidents unfold? Are you motivated by team success? If so, continue reading… We are on a mission to build and strengthen our engineering teams to match the accelerating success of Street...


  • Old Toronto, Canada Ascend Fundraising Solutions Full time

    We are seeking a skilled Cloud Reliability Engineer to collaborate with our IT team in Toronto. In this role, you will work closely with the client services team to diagnose, troubleshoot, and resolve system reliability issues. Responsibilities: Take ownership of customer-reported issues and drive them to resolution. Develop proactive measures to prevent...


  • Old Toronto, Canada CentML Full time

    At CentML, we are seeking a talented Site Reliability Engineer - Automation to join our team. We have a strong founding team that includes experts in AI, compilers, and ML hardware. Our co-founder and CEO, Gennady Pekhimenko, is a world-renowned expert in ML systems who has received multiple academic and industry research awards from top tech companies. As a...


  • Old Toronto, Canada Aversan Inc Full time

    Hardware Design Reliability Engineer. North York, Ontario. Position Summary: Responsible for the hardware reliability activities regarding the hardware products within Engineering perimeter. Essential Functions / Key Areas of Responsibility: Monitor the hardware reliability of the hardware systems in the field. Maintain a table with all the hardware returns...


  • Old Toronto, Canada Sentry Full time

    About the role The Site Reliability Engineering team is responsible for the deployment, configuration, maintenance, and monitoring of Sentry's hosted platform. We do this by leveraging automation tools to automatically spin up and scale services to meet the traffic demands of 1,000,000+ developers.


  • Old Toronto, Canada Street Context Full time

    Are you a Site Reliability Engineer that has a passion for building reliable, resilient and performant systems that scale? We are on a mission to build and strengthen our engineering teams to match the accelerating success of Street Context. We provide a premium Email, Analytics and Broker Relationship platform, purpose-built for capital markets and...


  • Old Toronto, Canada The Home Depot Canada Full time

    About The Job: As a Cloud Reliability Engineer Lead at The Home Depot Canada, you will play a crucial role in ensuring the reliability, performance, and operational support of our eCommerce systems. Job Overview: This position requires a strong background in reliability reviews, performance engineering practices, production engineering, and operational support,...


  • Old Toronto, Canada Soda Full time

    Job Description Job Title: Site Reliability Engineer Location: Poland - Fully Remote Salary: 324K PLN or 27.3K monthly Start: ASAP Stack: AWS, Docker, Kubernetes, Terraform, Jenkins, Ansible, Linux, JavaScript, and Lambda. Are you a seasoned DevOps/SRE professional passionate about building high-performance, scalable systems? I am working with a Media/IT...


  • Old Toronto, Canada Olx Full time

    Site Reliability Engineer. Remote Poland, Poland. OLX – Engineering / Full-time / Remote. At OLX, we work together to build a more sustainable world through trade. We make it safe, smart, and convenient to buy and sell cars, find housing, get jobs, buy and sell household goods, and more. Our colleagues around the world help to serve millions of people around...


  • Old Toronto, Canada Sentry Full time

    The Site Reliability Engineering team is responsible for the deployment, configuration, maintenance, and monitoring of Sentry's hosted platform. We do this by leveraging automation tools to automatically spin up and scale services to meet the traffic demands of 1,000,000+ developers. Sentry receives over a billion events a day and processes terabytes of...


  • Toronto, Ontario, Canada Riverside Natural Foods Full time

    Company Overview: Riverside Natural Foods is a forward-thinking company that prioritizes innovation, sustainability, and employee well-being. Our mission is to create delicious, nutritious snacks that are good for our customers, the planet, and our employees. Salary and Benefits: We offer a competitive salary range of $55,000 - $65,000 per year, depending on...