Cloud Site Reliability Engineer

2 weeks ago


Canada Smile Digital Health Full time

Job Title: Cloud Site Reliability Engineer

Base Pay Range: CA$100,000.00/yr - CA$120,000.00/yr

Company Overview

Smile Digital Health is a global health data platform that empowers healthcare stakeholders to collect and exchange data through a leading FHIR-based data liberation platform. The company has been recognized on Deloitte’s Technology Fast 50 Ranking for 2024 and is committed to the #BetterGlobalHealth mission.

Responsibilities

  • Collaborate with Security Operations teams to define and implement best practices for Cloud Service Provider configuration on Azure and other cloud providers.
  • Develop, implement, and coordinate a multi-tenant approach to service offerings for DB, Container platform, Authentication, Certificates, and Product Registries.
  • Design and maintain performance testing strategies, frameworks, and environments in the cloud.
  • Develop and maintain cost/utilization tracking and attribution processes for all Cloud Service Providers.
  • Create documentation for Cloud Service Provider offerings detailing use cases, best practices, and implementation details.
  • Develop and maintain technical relationships with core Cloud Service Providers.
  • Implement and maintain a secure and scalable infrastructure platform for delivering Cloud Services applications.
  • Ensure internal and external SLAs meet or exceed expectations; continuously monitor and improve system-centric KPIs.
  • Create tools for automating deployment, monitoring, and operations of the overall platform.
  • Participate in an on-call rotation to provide application support, incident management, and troubleshooting.
  • Provide ongoing maintenance and support of internal tools, improving system health and reliability.
  • Assist customers with on-site deployments when needed.
  • Implement and manage observability tools (logging, metrics, tracing) for performance insights, preferably using the OTEL and Grafana stack.
  • Comply with organizational security policies and procedures.
  • Accurately report working hours in the Time Tracking System on a daily or weekly basis, ensuring the majority of hours are tracked as billable.
  • Adhere to privacy, security, and confidentiality policies, holding all confidential information in strict confidence.

Requirements

  • Demonstrated expertise in cloud service providers and best practices for implementation and configuration, preferably managing Azure on behalf of multiple teams for a company that delivers SaaS products.
  • Experience with Kubernetes, OpenShift, Kafka, and the Elastic stack, with a strong focus on Java-based microservices architecture.
  • Experience applying chaos engineering practices to evaluate and enhance system resiliency.
  • Skilled in troubleshooting performance issues, including analyzing time consumption, allocating resources, and recommending optimizations.
  • Familiarity with performance testing methodologies and tools to assess system behavior under load.
  • Proven experience with Security and Compliance (SOC2, HIPAA, ISO27001) best practices and implementing controls that support high-velocity software delivery teams.
  • Proficiency in Terraform, Ansible, or Chef; expertise in troubleshooting, support escalation, on-call process optimization, and documenting knowledge.
  • Passion for Infrastructure as Code, automation, and developing solutions that help developers move quickly and safely.
  • Familiarity with infrastructure management and operations lifecycle concepts and ecosystem.
  • Experience operating and maintaining production systems in a Linux and public cloud environment.
  • Prior experience working in high-performance or distributed systems; ability to work across a range of experience levels.
  • Working knowledge of industry best practices regarding information security.
  • Prior experience building or maintaining a large-scale Cloud service.
  • Proven ability to prioritize and track multiple projects in parallel and to be highly responsive and customer-focused.

Benefits

  • Remote Work Environment
  • Flexible Time Away From Work Policy including PTO, Personal, and Sick Days
  • Competitive Salary and Health/Medical Benefits
  • RRSP/TFSA/401K Employee Contribution
  • Life and Disability Coverage
  • Employee Assistance Program
  • FHIR Study Program and Skillsoft Learning
  • Super HAPI Fun Club

Equal Employment Opportunity

We welcome and encourage candidates of all backgrounds to apply. Candidates are encouraged to inform us if they wish to discuss or require accommodations during interviews or while working at Smile.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Seniority Level: Not Applicable
Employment Type: Full-time
Job Function: Engineering and Information Technology
Industries: Transportation, Logistics, Supply Chain and Storage
Referral Program: Referrals increase your chances of interviewing at Smile Digital Health by 2x
Location: Greater Montreal Metropolitan Area



  • Canada Smile Digital Health Full time

    Working for a company like Smile Digital Health means supporting our mandate for #BetterGlobalHealth . We strive towards this goal every day, and the results can be seen in the impact of our innovative health data platform and data management solutions, which are used in over 20 countries. We were #19 on Deloitte's Technology Fast 50 Ranking for 2024! Smile...


  • Toronto, Ontario, Canada Scotiabank Global Site Full time $105,000 - $170,000 per year

    Requisition ID: 244026. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Overview: As a Site Reliability Engineer (SRE), you will join the Digital Engineering Operations team, responsible for ensuring the operations and reliability of Scotiabank digital applications. You will have the opportunity to drive...


  • Toronto, Ontario, Canada Scotiabank Global Site Full time US$80,000 - US$140,000 per year

    Requisition ID: 244027. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Overview: As a Site Reliability Engineer (SRE), you will join the Digital Engineering Operations team, responsible for ensuring the operations and reliability of Scotiabank digital applications. You will have the opportunity to drive...


  • Toronto, Ontario / Remote, Canada Smile Digital Health Full time US$1,000,000 - US$1,440,000 per year

    Working for a company like Smile Digital Health means supporting our mandate for #BetterGlobalHealth. We strive towards this goal every day, and the results can be seen in the impact of our innovative health data platform and data management solutions, which are used in over 20 countries. We were #19 on Deloitte's Technology Fast 50 Ranking for Smile...


  • Canada Icon Full time

    Helping SaaS companies scale Engineering teams. Director, Site Reliability Engineering: We are seeking an accomplished Director of Site Reliability Engineering (SRE) to lead the reliability, scalability, and performance initiatives across multiple enterprise technology domains, including AML, Risk, Finance, Corporate Treasury, and Human Resources systems....


  • Canada Compass Digital Full time

    As a Reliability Engineer you will work in focus areas such as observability, release automation, incident and problem response improvements, security, code quality, patch management and SRE advocacy. You will have the opportunity to use the latest and greatest cloud and open-source technology to enable our product and test engineering teams through...


  • Canada CENGN - (Centre of Excellence in Next Generation Networks) Full time

    Site Reliability Engineer role at CENGN - (Centre of Excellence in Next Generation Networks). Use of AI in Hiring: No, we do not use AI in screening/selection. Vacancy Status: This posting is for an existing vacancy. We are hiring for 1 position. About Us: CENGN is Canada’s Centre of Excellence in Next Generation Networks. Our mission...


  • Canada Thinkific Full time

    Senior Site Reliability Engineer role at Thinkific. Are you an experienced Site Reliability Engineer looking for a new challenge? We’re looking for a Senior Site Reliability Engineer to join us at Thinkific. We’re looking for a Senior Site Reliability Engineer...


  • Canada Tyk Full time

    About Tyk The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across the retail, finance, telecoms, healthcare, or...


  • Canada Omniscient Neurotechnology (o8t) Full time

    This pay range is provided by Omniscient Neurotechnology (o8t). Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Base pay range: $115,000.00/yr - $120,000.00/yr. About Omniscient: Omniscient...