Senior Site Reliability Engineer, Data
1 week ago
Focal Systems is the industry leader in retail AI solutions. We are a Silicon Valley-based startup that has more than doubled in size every year since inception. Our mission is to automate and optimize brick and mortar retail using deep learning computer vision. We are looking for smart, creative and passionate people who want to help build a great and enduring company and deploy Deep Learning to the world.
- Work with Backend, Frontend and Deep Learning teams and write infrastructure automation code for their needs.
- Identify scalability bottlenecks through load testing and plan infrastructure architecture.
- Create tools to provide transparency/ease of access into the company's rich datasets stored across varying geographic locations and data formats.
- Design, build, and manage a robust Continuous Integration and Continuous Deployment (CI/CD) pipeline.
Requirements
- Solid experience in an infrastructure or Site Reliability Engineer (SRE) role.
- Great understanding of SQL, networking, distributed systems, operating systems (Debian) and software engineering practices.
- Terraform or other Infrastructure as Code automation solution.
- Operating Relational SQL databases and Redis at terabyte scale.
- Proven experience with setting up monitoring/alerting and reliability engineering.
- Scripting skills in Python.
- Must be comfortable with 12-hour on-call rotations.
- Setting up automation for complex load testing scenarios.
- Tuning Deep Learning pipelines with Python, PyTorch and multiprocessing.
- Backend programming with Python.
Exceptional Team - We are a team of hard-working, fun-loving professionals from some of the most eminent universities, research labs, and tech companies of our time. We pride ourselves on recruiting exceptional individuals to help us redefine the state-of-the-art.
Outstanding Partners - We work with 10+ of the largest retailers in the world and have a world-class roster of investors, advisors and partners to support & advise us in our endeavors.
We care deeply about the health, happiness, and wellbeing of all of our employees.
-
Senior Site Reliability Engineer
4 months ago
Toronto, Canada Northbridge Financial Corporation Full time
What is it like to be a Senior Site Reliability Engineer at Northbridge Financial? The Senior Site Reliability Engineer oversees the creation and implementation of Service Level Objectives (SLOs). The Senior SRE handles service reliability solutions and processes of increasing complexity, and is responsible for mentoring and leading less experienced...
-
Site Reliability Engineer
7 months ago
Toronto, Canada CB Canada Full time
Site Reliability Engineer
On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer.
Site Reliability Engineer - Job Description
Azure cloud; Jira and Confluence; CI/CD; experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App Services, Azure...
-
Senior Site Reliability Engineer
2 weeks ago
Old Toronto, Canada RBC Full time
About the Role
We are seeking an experienced Senior Site Reliability Engineer to join our US Cash Management Technology team at RBC. As a key member of our team, you will be responsible for leading the development, implementation, and support of Site Reliability Engineering (SRE) solutions for applications supported by the Commercial, Core Banking, and...
-
Site Reliability Engineering Manager
1 week ago
Old Toronto, Canada Tbwa ChiatDay Inc Full time
Automate and Optimize Brick and Mortar Retail
Focal Systems is the industry leader in retail AI solutions, revolutionizing brick and mortar retail with deep learning computer vision. As a Silicon Valley-based startup, we have more than doubled in size every year since inception.
Our Mission
We are looking for smart, creative, and passionate individuals who want...
-
AWS Site Reliability Engineer
2 months ago
Old Toronto, Canada Sentry Full time
The Site Reliability Engineering team is responsible for the deployment, configuration, maintenance, and monitoring of Sentry's hosted platform. We do this by leveraging automation tools to automatically spin up and scale services to meet the traffic demands of 1,000,000+ developers. Sentry receives over a billion events a day and processes terabytes of...
-
AWS Site Reliability Engineer
1 month ago
Old Toronto, Canada Soda Full time
Job Description
Job Title: Site Reliability Engineer
Location: Poland - Fully Remote
Salary: 324K PLN or 27.3K monthly
Start: ASAP
Stack: AWS, Docker, Kubernetes, Terraform, Jenkins, Ansible, Linux, JavaScript, and Lambda.
Are you a seasoned DevOps/SRE professional passionate about building high-performance, scalable systems? I am working with a Media/IT...
-
Site Reliability Engineering Linux or Windows
2 months ago
Old Toronto, Canada Thomson Reuters Full time
(Canada) Site Reliability Engineer (Contract)
Contract (9 months 4 days)
Published 3 days ago
New Relic, Datadog
Site Reliability Engineer - in the Service Management Organization
Do you have experience in IT Service Management, working with cloud providers, software development, and technology infrastructure? The Site Reliability Engineer will analyze chronic...
-
Senior Site Reliability Engineer
2 weeks ago
Old Toronto, Canada Loblaw Companies Ltd - Head Office Full time
All candidate referrals must first be submitted in Workday by a current Loblaw colleague.
Come make your difference in communities across Canada, where authenticity, trust and making connections are valued - as we shape the future of Canadian retail, together....
-
Senior Site Reliability Engineer
3 months ago
Toronto, Canada Vantage Full time
Senior Site Reliability Engineer / DevOps Engineer
Are you passionate about ensuring the seamless operation of large-scale, distributed, and robust systems? Do you thrive on optimizing performance, increasing reliability, and automating tasks to create more efficient processes? Are you hungry for learning? If so, we would like to chat with you! As a...
-
Senior Site Reliability Engineer
6 months ago
Greater Toronto Area, Canada GlossGenius Full time
About GlossGenius
GlossGenius is building an ecosystem enabling entrepreneurs to succeed. We empower small business owners to focus on being creators, not admins, by offering a range of business management tools including booking and scheduling, marketing, analytics, payment processing and much more. Over 75,000 small business owners have chosen to...
-
Site Reliability Engineering Lead
2 weeks ago
Old Toronto, Canada TD Full time
Job Overview
We are seeking a highly skilled Site Reliability Engineering Lead to join our team at TD. As a key member of our technology group, you will be responsible for ensuring the stability, scalability, and reliability of our platforms.
About the Role
The ideal candidate will have a minimum of 8 years of experience in site reliability engineering, with a...
-
Senior Site Reliability Engineer
7 months ago
Toronto, Canada Criteo Full time
What You'll Do:
What’s a PRE Team?
The concept of Product Reliability Engineering (PRE) was born from an industry-leading online SRE book (go ahead, “Google” it). At Criteo, we are the bridge between Product and Platform Engineering. The PRE group is composed of 7 teams of people with a wide variety of backgrounds, experiences and perspectives. How...
-
AWS Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Tecsys Inc. Full time
Having recognized the advantages of remote work, including employee morale, productivity, and the effect of reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The technologies and programs in which we invested have provided a fantastic foundation to this end. Our digital-first work environment, together with our...
-
Senior Cloud Reliability Architect
3 weeks ago
Old Toronto, Canada Northbridge Financial Corporation Full time
About the Role
At Northbridge Financial Corporation, we are seeking a highly skilled Senior Cloud Reliability Architect to oversee the creation and implementation of Service Level Objectives (SLOs). This senior role involves handling complex service reliability solutions and is responsible for mentoring and leading less experienced engineers.
We Want Your...
-
AWS Site Reliability Engineer
1 month ago
Old Toronto, Canada Street Context Full time
Are you a Site Reliability Engineer who has a passion for building reliable, resilient and performant systems that scale?
We are on a mission to build and strengthen our engineering teams to match the accelerating success of Street Context. We provide a premium Email, Analytics and Broker Relationship platform, purpose-built for capital markets and...
-
Senior Site Reliability Engineer
5 months ago
Toronto, Canada Thomson Reuters Full time
Description
Thomson Reuters is seeking a Senior Site Reliability Engineer to join our Service Management, Technology team. This role calls for an individual who is capable of analyzing customer problems of high complexity and assessing the scope of impact, while mitigating customer impact of issues and executing workarounds. Willingness to learn is...
-
Senior Site Reliability Engineer/DevOps
3 weeks ago
Toronto, ON, Canada PointsBet Canada Full time
SITE RELIABILITY ENGINEER
As a Site Reliability Engineer (SRE), you will ensure the reliability, scalability, and performance of our product. You will lead efforts in proactive monitoring, incident management, and automation, collaborating across teams to implement best practices in reliability engineering. PointsBet is a sports & casino betting operator...
-
AWS Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Tecsys Full time
Tecsys is a fast-growing innovator offering supply chain solutions to industry-leading healthcare systems, hospitals, and pharmacy businesses, as well as distributors, retailers, and 3PLs. As a Cloud Infrastructure Specialist, you will be responsible for ensuring the reliability and uptime of our platform and applications in a data-driven way to support internal and...
-
AWS Site Reliability Engineer
2 months ago
Old Toronto, Canada Sentry Full time
Bad software is everywhere, and we’re tired of it. Sentry is on a mission to help developers write better software faster, so we can get back to enjoying technology. With more than $217 million in funding and 100,000+ organizations that believe we’re on to something, we're building performance and error monitoring tools that help companies like Disney,...
-
AWS Site Reliability Engineer
1 month ago
Old Toronto, Canada Olx Full time
Site Reliability Engineer
Remote Poland, Poland
OLX – Engineering / Full-time / Remote
At OLX, we work together to build a more sustainable world through trade. We make it safe, smart, and convenient to buy and sell cars, find housing, get jobs, buy and sell household goods, and more. Our colleagues around the world help to serve millions of people around...