Site Reliability Engineer
6 days ago
SITE RELIABILITY ENGINEER
ABOUT THE ROLE
As a Site Reliability Engineer (SRE), you will ensure the reliability, scalability, and performance of our product. You will lead efforts in proactive monitoring, incident management, and automation, collaborating across teams to implement best practices in reliability engineering. Your expertise will drive resilient infrastructure, minimize downtime, and enhance operational efficiency to support business goals.
ABOUT POINTSBET
We provide an opportunity for our people, our most powerful and irreplaceable resource, to work in a rewarding, fun, and challenging environment that serves as an instrument for personal and professional growth.
PointsBet is a sports & casino betting operator that sits in the very rare position of owning and controlling its technology end to end. Our proprietary platform and our commitment to an in-house approach power unrivalled innovation and personalized experiences that our customers cannot get anywhere else.
WHAT YOU WILL OWN
Monitoring and Alerting:
- Implement intelligent monitoring solutions.
- Proactively monitor system health and performance metrics using dashboards and alerts.
- Define and refine SLIs, SLOs, and error budgets.
Incident Management:
- Analyze system health and performance to identify and mitigate incidents.
- Participate in on-call rotations and respond to incidents quickly and effectively.
- Serve as an incident commander, coordinating communication, mitigation, and resolution of incidents.
- Conduct post-incident reviews (PIRs) and drive follow-up actions to prevent recurrence.
Platform Resiliency:
- Design, implement, and manage resilient systems and architectures.
- Conduct regular performance and reliability reviews, identifying areas for improvement.
Automation:
- Reduce manual operational tasks by building automation solutions using Shell scripting, PowerShell or Python.
- Develop solutions for scaling, monitoring/alerting, auto-healing, and automation to improve infrastructure scalability and reliability.
- Automate incident response procedures to improve efficiency and reduce manual intervention.
Collaboration:
- Work closely with development, product, and operations teams to align on objectives and drive reliability goals.
- Share technical knowledge across teams via extensive documentation.
Cost Optimization:
- Analyze cloud usage and identify opportunities for cost reduction.
- Implement best practices for efficient use of resources while maintaining system performance.
SKILLS WE SEEK
- Bachelor’s degree in computer science, engineering, or a related field.
- 2+ years of experience as a Site Reliability Engineer or similar role.
- Experience with cloud platforms and services (Azure, AWS, or GCP).
- Familiarity with container orchestration platforms (e.g., Kubernetes, Docker Swarm).
- Experience with monitoring / observability tools (e.g., Datadog, Prometheus, Grafana, Azure Monitor, Dynatrace).
- Experience with log analysis tools such as Splunk or Azure Log Analytics (using Kusto queries).
- Experience with Infrastructure as Code (IaC) tools such as Terraform or ARM.
- Experience with provisioning or configuration management tools like Ansible.
- Experience working with SQL, PostgreSQL etc.
- Proficiency in scripting languages (Shell, PowerShell, Python, Go) for automation.
- Message Brokers: Azure Service Bus, RabbitMQ
- Experience with Git.
- Incident management tools: PagerDuty, Opsgenie, or Jira.
- Experience with Work management tools like Jira, Azure DevOps.
- Familiarity with cloud cost management tools and practices.
- Proficient in SRE best practices, including defining and maintaining SLIs, SLOs, and error budgets.
- Understanding of DevOps principles and practices.
- Experience improving resiliency and observability in software frameworks such as .NET and .NET Core.
PERKS AND BENEFITS
- Hybrid work arrangements
- Fun downtown office on Queen St. West
- Sabbatical Leave
- Pet-friendly office
- No meetings on Fridays
- Paid volunteer days
- Generous Vacation Time & Personal Days
- Generous parental leave policy
- Holiday shutdown
- Group Retirement Savings Plan (with Employer matching)
- Culture Events (sporting events, concerts, happy hours, holiday parties, team outings)
- Incredible culture fostered by a highly collaborative and high-performing team
- Professional development opportunities, working closely with the senior leadership team
IMPORTANT INFORMATION
PointsBet Canada is dedicated to a high-performance culture and ensuring our employees are set up to deliver their best. We offer a fun, dynamic work environment where emphasis is placed on our most important asset: our people. If you are driven and searching for a new opportunity that values people, creativity, opportunity, results, and a commitment to excellence, then this is where you want to be.
PointsBet Canada views responsible gambling as an ethical responsibility and an important part of a sustainable business model. We’re proud to be recognized as a socially responsible operator committed to integrating responsible gambling resources and tools throughout the entire player journey.
PointsBet Canada is an equal opportunity employer, committed to inclusion and diversity. PointsBet Canada does not discriminate based on race, national or ethnic origin, colour, religion, age, sex, sexual orientation, gender identity or expression, marital status, family status, genetic characteristics, disability or any other basis forbidden under federal, provincial, or local law.
Accommodations are available on request for candidates taking part in all aspects of the recruitment process. If you require accommodation, or require recruitment documents in an alternative format, please contact us at can-hr [at] pointsbet [dot] com.
-
Site Reliability Engineering Leader
6 days ago
Toronto, Ontario, Canada Royal Bank of Canada Full time
Royal Bank of Canada is seeking a highly skilled Site Reliability Engineering (SRE) leader to join our team in Toronto, Canada. As an SRE leader, you will be responsible for leading the development and implementation of SRE solutions that improve the reliability and performance of our applications. The ideal candidate will have 5+ years of experience as a...
-
Site Reliability Engineering Lead
3 weeks ago
Toronto, Ontario, Canada Compunnel Inc. Full time
Compunnel Inc. is a leading provider of innovative technology solutions. We are seeking an experienced Site Reliability Engineering Lead to join our team in Toronto, Canada. The estimated salary for this position is $170,000 per year, considering the location and industry standards. About the Job: This role is perfect for someone who is passionate about driving...
-
Reliability Engineer
6 days ago
Toronto, Ontario, Canada Major Recruitment Full time
Reliability Engineer. ***Must be Canadian Citizen or Permanent Resident requiring no sponsorship*** My client has a shared vision for greatness. We manufacture some of North America’s most popular tissue brands - Cashmere®, Purex®, Scotties®, SpongeTowels®, Bonterra®, White Cloud®, as well as products for use away from home. We are leaders in our...
-
Staff Site Reliability Engineer
1 month ago
Toronto, Ontario, Canada Index Exchange Full time
About the Role: We are seeking a highly skilled Staff Site Reliability Engineer to own and develop on-premise and hybrid cloud environments, focusing on low-latency performance on Kubernetes platforms supporting a robust developer experience framework. The ideal candidate will have a deep technical understanding of on-premise and hybrid cloud architectures and...
-
Chief Site Reliability Engineer
3 weeks ago
Toronto, Ontario, Canada Index Exchange Full time
About Index Exchange: We have a rich history of shaping the earliest forms of ad tech, and we're now looking for talented engineers to help drive its future. Our customers face unique challenges that require technical expertise at internet scale. Our infrastructure handles over 450 billion requests daily, all running in our own global data centers. We provide...
-
Cloud Native Site Reliability Engineer
3 weeks ago
Toronto, Ontario, Canada Thomson Reuters Full time
We are seeking an experienced Senior SRE to join our Shared Capabilities, Service Reliability and Operation team in Toronto. As a Cloud Native Site Reliability Engineer, you will be responsible for implementing site reliability engineering and DevOps best practices, building and maintaining monitoring for all aspects of infrastructure, micro-services, usage...
-
Site Reliability Team Lead
4 weeks ago
Toronto, Ontario, Canada Sentry Full time
About Sentry: We're on a mission to help developers write better software faster, so we can get back to enjoying technology. With more than $217 million in funding and 100,000+ organizations that believe we're on to something, we're building performance and error monitoring tools that help companies like Disney, Microsoft, and Atlassian spend less time fixing...
-
Highly Skilled Site Reliability Engineer
5 days ago
Toronto, Ontario, Canada Compunnel Inc. Full time
At Compunnel Inc., we are looking for a talented Senior Site Reliability Engineer/DevOps to join our team. This is a challenging opportunity to work with the latest tools and technologies to drive forward Automation, Observability and CI/CD automation. The ideal candidate is passionate about driving SRE DevSecOps mindset and culture in a fast-paced...
-
Site Reliability Specialist
4 weeks ago
Toronto, Ontario, Canada Royal Bank of Canada Full time
Job Summary: We are seeking a talented Site Reliability Engineer to join our Digital team at Royal Bank of Canada. This is an exciting opportunity to accelerate our cloud native initiatives and make a difference in the industry. Job Description: We are looking for an individual who embodies leadership, mentorship, and decision-making qualities. As a Site...
-
Reliability Engineering Specialist
3 weeks ago
Toronto, Ontario, Canada Criteo Full time
About the Role: This is a challenging opportunity for an experienced engineer to join Criteo's PRE team as a Site Reliability Engineer. The role involves working closely with product engineering to improve the reliability of our apps, systems, and pipelines, assessing where optimization is needed most, and telling stories with meaningful monitoring. Key...
-
Global Site Reliability Engineer
3 weeks ago
Toronto, Ontario, Canada mccainfood Full time
Job Summary: We are seeking a highly skilled Global Site Reliability Engineer to join our team. As a key member of our organization, you will be responsible for ensuring the reliability, performance, and scalability of our global communication services.
-
Reliability Engineering Lead
1 day ago
Toronto, Ontario, Canada Royal Bank of Canada Full time
Job Summary: Royal Bank of Canada is seeking an experienced professional to lead our Site Reliability Engineering (SRE) efforts for our US Cash Management Technology. This is a unique opportunity to shape the future technology landscape of the company, delivering key business values and implementing strategic components across all RBC functions defined in our...
-
Reliability Engineering Manager
4 weeks ago
Toronto, Ontario, Canada Estée Lauder Companies Full time
Reliability Engineering Manager Role: We are seeking a highly skilled Reliability Engineering Manager to join our team at Estée Lauder Companies. As a key member of the Plant Management Team, you will be responsible for leading maintenance and reliability processes to achieve operational excellence. The ideal candidate will have a strong background in plant...
-
Reliability Engineering Expert
4 weeks ago
Toronto, Ontario, Canada Criteo Full time
About the Role: Criteo is seeking a talented Site Reliability Engineer to join our PRE team. What You'll Do: As a Site Reliability Engineer, you'll work closely with product engineering to improve the reliability of our apps, systems, and pipelines. You'll assess where optimization is needed most and tell stories with meaningful monitoring. How You'll Make an...
-
Site Reliability Engineer
3 weeks ago
Toronto, Ontario, Canada Tecsys Inc. Full time
About the Role: We are looking for an exceptional Site Reliability Engineer to join our Network and Security Operations Center team. As a key member of our team, you will be responsible for ensuring the reliability and uptime of our platform and applications. Key Responsibilities: Collaborate with Engineering teams to support services through system design...
-
Senior Site Reliability Engineering Manager
3 weeks ago
Toronto, Ontario, Canada Index Exchange Full time
About Index Exchange: We are shaping the future of ad tech and seeking an experienced Senior Site Reliability Engineering Manager to lead our SRE team. As a key member of our technical leadership, you will be responsible for building and managing a high-performing SRE team, fostering a culture of innovation, collaboration, and accountability. You will provide...
-
Site Reliability Engineer/DevOps Expert
6 days ago
Toronto, Ontario, Canada Compunnel Inc. Full time
About Compunnel Inc.: Compunnel Inc. is a fast-paced and dynamic company seeking a skilled Site Reliability Engineer/DevOps Expert to join our team. Job Description: We are looking for an experienced professional who can drive the SRE DevSecOps mindset and culture in our organization. The ideal candidate will have a strong passion for driving automation,...
-
Site Reliability Engineer
1 month ago
Toronto, Ontario, Canada SGS Full time
Job Summary: The Site Reliability Engineer will play a critical part in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with ASP.NET MVC, Angular, and Web API. Key Responsibilities: Partner with developers and product operations teams to understand application requirements and translate them into...
-
Site Reliability Engineer for Long-Term Project
4 weeks ago
Toronto, Ontario, Canada Lorven Technologies Full time
We are seeking a skilled Site Reliability Engineer to support our long-term project in a hybrid environment. The successful candidate will have strong expertise in Azure and OpenShift, as well as experience with Dynatrace/ELK/Splunk for monitoring and observability. Key Responsibilities: Develop SRE solutions (monitoring and alerting, machine learning anomaly...
-
Reliability Systems Engineer
4 days ago
Toronto, Ontario, Canada Teranet Inc. Full time
About Teranet: Teranet is a leading innovator in electronic services and solutions, operating one of the most advanced and secure registration systems worldwide. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our DevOps team. The ideal candidate will possess strong software engineering principles and infrastructure expertise to...