Reliability Engineering Specialist
3 weeks ago
We're seeking a skilled Software Developer to join our Reliability Engineering team at Robinhood. As part of this team, you'll play a crucial role in designing, evolving, and maintaining large-scale distributed systems.
The team is focused on building robust, scalable systems that ensure high availability and low latency. Our primary areas of focus include developing a company-wide software system for tracking outages/SEVs and monitoring critical workflows for the business.
In this role, you'll combine your software and systems knowledge to engineer distributed systems that are reliable, scalable, and fault-tolerant for Robinhood. You'll work closely with other infrastructure teams to achieve this goal.
Our technology stack primarily consists of Python/Go and container orchestration technologies such as Kubernetes. We also utilize microservice-oriented architectures and related OSS technologies like Kafka, Celery/RabbitMQ, nginx, Redis, Postgres, Airflow, and Consul. Our systems are built within AWS.
Responsibilities
- Design and implement new features and services with a focus on high availability, low latency, and scalability.
- Continually optimize systems and workflows by improving architecture, infrastructure, automation, CI/CD, and observability.
- Act as an owner and leader of Robinhood's infrastructure by ensuring project infrastructure needs are met and working proactively with customer teams to help them improve reliability.
Requirements
- Fluent in one or more programming languages (e.g., Go, Python, Java).
- Experience authoring and operating high-scale services.
- Experience with scalable distributed systems, either built from scratch or on public cloud primitives.
- Plus points if you have experience with Python/Django/Go and AWS.
We're committed to providing an inclusive and welcoming interview experience for all candidates. If you need additional assistance throughout the process due to a physical or mental condition, or if there's something our team can do to enable a more accessible experience, please notify us in advance.
-
Reliability Engineering Specialist
2 weeks ago
Toronto, Ontario, Canada SGS Full time Job Title: Reliability Engineering Specialist At SGS, we are seeking a skilled Reliability Engineering Specialist to join our team. This role plays a critical part in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with MVC, Angular, and Web API. As a key member of our team, you will partner with...
-
Site Reliability Engineer
4 weeks ago
Old Toronto, Canada Lorien Full time Hybrid - Manchester We are currently working with a leading gambling company dedicated to providing exceptional gaming experiences. They are looking for an experienced Site Reliability Engineer with a strong skill set in system reliability to join its world-class technology team. This role is ideal for someone who has 4+ years of experience within the...
-
Toronto, Ontario, Canada The Engineering Institute of Canada Full time Job Summary As a Senior Technical Specialist, Equipment Reliability, you will play a key role in developing and maintaining a deep technical understanding of our insured's businesses to enable world-class insurance engineering services. Your expertise in rotating equipment, specifically prime movers for power generation, will be highly...
-
Site Reliability Engineer
1 month ago
Old Toronto, Canada TD Bank Full time Site Reliability Engineer Work Location: Canada Hours: 37.5 Line of Business: Technology Solutions Pay Details: We’re committed to providing fair and equitable compensation to all our colleagues. As a candidate, we encourage you to have an open dialogue with a member of
-
Asset Reliability Engineer
2 months ago
Old Toronto, Canada Chelsea Avondale Full time Chelsea Avondale is the world’s most cutting-edge home insurance group. We have developed sophisticated risk modeling and insurance pricing technologies for home insurance and deploy that technology through our own insurance company. Our team consists of some of the brightest minds in insurance, software development, finance, and operations. Our group...
-
Reliability Engineering Specialist
2 weeks ago
Toronto, Ontario, Canada Criteo Full time About the Role: This is a challenging opportunity for an experienced engineer to join Criteo's PRE team as a Site Reliability Engineer. The role involves working closely with product engineering to improve the reliability of our apps, systems, and pipelines, assessing where optimization is needed most, and telling stories with meaningful monitoring. Key...
-
AWS Site Reliability Engineer
2 months ago
Old Toronto, Canada TD Bank Full time Site Reliability Engineer Work Location: Canada Hours: 37.5 Line of Business: Technology Solutions Pay Details: We’re committed to providing fair and equitable compensation to all our colleagues. Job Description: CUSTOMER Provide technical leadership to improve the design and operation of systems in alignment to reliability...
-
Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Street Context Full time Are you a Site Reliability Engineer that has a passion for building reliable, resilient and performant systems that scale? Do you command with a steady hand when incidents unfold? Are you motivated by team success? If so, continue reading… We are on a mission to build and strengthen our engineering teams to match the accelerating success of Street...
-
Cloud Reliability Engineer
1 month ago
Old Toronto, Canada Ascend Fundraising Solutions Full time We are seeking a skilled Cloud Reliability Engineer to collaborate with our IT team in Toronto. In this role, you will work closely with the client services team to diagnose, troubleshoot, and resolve system reliability issues. Responsibilities: Take ownership of customer-reported issues and drive them to resolution. Develop proactive measures to prevent...
-
AWS Site Reliability Engineer
4 weeks ago
Old Toronto, Canada Lorien Full time Hybrid - Manchester We are currently working with a leading gambling company dedicated to providing exceptional gaming experiences. They are looking for an experienced Site Reliability Engineer with a strong skill set in system reliability to join its world-class technology team. This role is ideal for someone who has 4+ years of experience within the...
-
Site Reliability Engineer
4 weeks ago
Old Toronto, Canada CentML Full time At CentML, we are seeking a talented Site Reliability Engineer - Automation to join our team. We have a strong founding team that includes experts in AI, compilers, and ML hardware. Our co-founder and CEO, Gennady Pekhimenko, is a world-renowned expert in ML systems who has received multiple academic and industry research awards from top tech companies. As a...
-
Hardware Design Reliability Engineer
1 month ago
Old Toronto, Canada Aversan Inc Full time Hardware Design Reliability Engineer North York, Ontario Position Summary Responsible for the hardware reliability activities regarding the hardware products within Engineering perimeter. Essential Functions / Key Areas of Responsibility Monitor the hardware reliability of the hardware systems in the field. Maintain a table with all the hardware returns...
-
Site Reliability Engineer
1 month ago
Old Toronto, Canada Sentry Full time About the role The Site Reliability Engineering team is responsible for the deployment, configuration, maintenance, and monitoring of Sentry's hosted platform. We do this by leveraging automation tools to automatically spin up and scale services to meet the traffic demands of 1,000,000+ developers.
-
AWS Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Street Context Full time Are you a Site Reliability Engineer that has a passion for building reliable, resilient and performant systems that scale? We are on a mission to build and strengthen our engineering teams to match the accelerating success of Street Context. We provide a premium Email, Analytics and Broker Relationship platform, purpose-built for capital markets and...
-
Cloud Reliability Engineer Lead
2 weeks ago
Old Toronto, Canada The Home Depot Canada Full time About The Job As a Cloud Reliability Engineer Lead at The Home Depot Canada, you will play a crucial role in ensuring the reliability, performance, and operational support of our eCommerce systems. Job Overview This position requires a strong background in reliability reviews, performance engineering practices, production engineering, and operational support,...
-
AWS Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Soda Full time Job Description Job Title: Site Reliability Engineer Location: Poland - Fully Remote Salary: 324K PLN or 27.3K monthly Start: ASAP Stack: AWS, Docker, Kubernetes, Terraform, Jenkins, Ansible, Linux, JavaScript, and Lambda. Are you a seasoned DevOps/SRE professional passionate about building high-performance, scalable systems? I am working with a Media/IT...
-
AWS Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Olx Full time Site Reliability Engineer Remote Poland, Poland OLX – Engineering / Full-time / Remote At OLX, we work together to build a more sustainable world through trade. We make it safe, smart, and convenient to buy and sell cars, find housing, get jobs, buy and sell household goods, and more. Our colleagues around the world help to serve millions of people around...
-
AWS Site Reliability Engineer
2 months ago
Old Toronto, Canada Sentry Full time The Site Reliability Engineering team is responsible for the deployment, configuration, maintenance, and monitoring of Sentry's hosted platform. We do this by leveraging automation tools to automatically spin up and scale services to meet the traffic demands of 1,000,000+ developers. Sentry receives over a billion events a day and processes terabytes of...
-
Reliability Engineering Specialist
3 weeks ago
Toronto, Ontario, Canada Riverside Natural Foods Full time Company Overview Riverside Natural Foods is a forward-thinking company that prioritizes innovation, sustainability, and employee well-being. Our mission is to create delicious, nutritious snacks that are good for our customers, the planet, and our employees. Salary and Benefits We offer a competitive salary range of $55,000 - $65,000 per year, depending on...