Sr./Staff - Infrastructure/Site Reliability Engineer (SRE)

1 week ago


Canada Oscilar Full time

Overview

Shape the future of trust in the age of AI

At Oscilar, we're building the most advanced AI Risk Decisioning Platform. Banks, fintechs, and digitally native organizations rely on us to manage their fraud, credit, and compliance risk with the power of AI. If you're passionate about solving complex problems and making the internet safer for everyone, this is your place.

Why Join Us

  • Mission-driven teams: Work alongside industry veterans from Meta, Uber, Citi, and Confluent, all united by a shared goal to make the digital world safer.
  • Ownership and impact: We believe in extreme ownership. You'll be empowered to take responsibility, move fast, and make decisions that drive our mission forward.
  • Innovate at the cutting edge: Your work will shape how modern finance detects fraud and manages risk.

About The Role

Oscilar is growing fast, and so is the complexity of our systems. We’re looking for an experienced SRE to take ownership of reliability across our multi-region, cloud-native platform. You’ll have the mandate and autonomy to design, implement, and evolve systems that stay performant and resilient through traffic spikes, dependency failures, and global deployments. You’ll be shaping how we scale, how we build observability, and how we run infrastructure that supports billions of events and large-scale data pipelines.

What You’ll Own

  • Architect and operate resilient cloud infrastructure (AWS, Pulumi, Kubernetes).
  • Lead initiatives to improve availability, latency, and performance at scale.
  • Design and evolve our CI/CD pipelines to optimize for speed, safety, and repeatability.
  • Define the metrics, alerts, and runbooks that form our observability backbone.
  • Run chaos experiments and failure simulations to harden the platform.
  • Mentor engineers and set best practices for SRE across the company.
What You Bring

  • Proven track record as a senior SRE, DevOps, or infrastructure engineer in high-scale environments.
  • Expert-level skills in AWS and Infrastructure as Code (Pulumi, Terraform).
  • Strong programming ability in Go and Java.
  • Deep understanding of distributed systems (Kafka, ClickHouse) and microservices architecture.
  • Mastery of container orchestration (Kubernetes) and production debugging.
  • Strong sense of ownership, and the judgment to balance velocity with reliability.

Benefits

  • Compensation: Competitive salary and equity packages, including a 401k plan
  • Flexibility: Remote-first culture; work from anywhere
  • Health: 100% employer-covered comprehensive health, dental, and vision insurance with a top-tier plan for you and your dependents (US and Canada)
  • Balance: Unlimited PTO policy
  • Technical: AI-first company; both co-founders are engineers at heart, and over 50% of the company is Engineering and Product
  • Culture: Family-friendly environment; regular team events and offsites
  • Development: Unparalleled learning and professional development opportunities
  • Gear: Home office setup assistance
  • Impact: Making the internet safer by protecting online transactions

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Technology, Information and Internet



  • Canada Remoteworldwide Full time

    Staff Infrastructure Site Reliability Engineer Posted: 04/05/2025 Anywhere in the world Remote Senior About the Team: Netlify’s SRE team is scaling to meet the demands of our rapidly growing platform and user base. Our SRE team is responsible for ensuring the reliability, scalability, and efficiency of...


  • (s): Canada : Ontario : Toronto Scotiabank Global Site Full time

    Requisition ID: 244026. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Overview: As a Site Reliability Engineer (SRE), you will join the Digital Engineering Operations team, responsible for ensuring the operations and reliability of Scotiabank digital applications. You will have the opportunity to drive...


  • Calgary, Alberta, TP Z, Canada Rogii Full time

    About ROGII: Get ready to buckle up and meet the powerhouse that's revolutionizing the Oil & Gas industry – ROGII. We're a dynamic team of tech enthusiasts who are not afraid to take risks and bring innovation to the forefront with our comprehensive software solutions. Here at ROGII, we're all about optimizing well operations and streamlining workflows for...


  • Canada Thumbtack Full time

    Sr. Software Engineer, Site Reliability Engineer Thumbtack helps millions of people confidently care for their homes. Thumbtack is the one app you need to take care of and improve your home — from personalized guidance to AI tools and a best-in-class hiring experience. Every day in every county of the U.S., people turn to Thumbtack to complete urgent...


  • Canada Dayforce Full time

    About the Opportunity: As a Site Reliability Engineer at Dayforce, you will be part of a pioneering team responsible for ensuring our industry-leading HCM platform delivers exceptional scalability, availability, and reliability. Dayforce is a global HCM technology company with operations across North America, EMEA, and APJ, and our award-winning cloud platform...


  • , MB, Canada MongoDB Full time

    Site Reliability Engineer (Senior or Staff), Fabric. The Team: Platform Engineering is the department within SRE that is responsible for a range of critical infrastructure and operational functions that support the broader engineering organization. Among these are our...


  • McCain Foods (Canada) McCain Foods Full time

    Position Title: Site Reliability Engineer. Position Type: Regular - Full-Time. Position Location: Toronto HQ. Requisition ID: 36904. Our Global Technology team's goal is to leverage technology and data to drive profitable growth, focus on enhancing customer experience and to further our purpose of 'Celebrating real connections through delicious,...


  • Canada Command Alkon Incorporated. Full time

    Title: Manager, Site Reliability Engineer (SRE). Summary of Role: The Site Reliability Engineer (SRE) Manager leads the teams responsible for ensuring the availability, performance, and reliability of mission‑critical systems. This role bridges the gap between software engineering and operations by implementing automation, observability, and scalability...


  • , BC, Canada Orion Innovation Full time

    Overview: Senior Site Reliability Engineer (SRE) with Kubernetes and Rancher. Full-time role focused on building and maintaining highly resilient, secure systems, including in air-gapped environments. Responsibilities: System Architecture & Management: Design, architect, and maintain highly reliable, multi-tenant systems using Kubernetes and related tools...


  • Canada Wealthsimple Full time

    Your career is an investment that grows over time! Wealthsimple is on a mission to help everyone achieve financial freedom by reimagining what it means to manage your money. Using smart technology, we take financial services that are often confusing, opaque and expensive and make them transparent and low-cost for everyone. We’re the largest fintech company...