Senior Site Reliability Engineer
4 weeks ago
Senior Site Reliability Engineer (SRE) – Azure Certified
Role Summary:
We are looking for a Senior Site Reliability Engineer (SRE) with 8+ years of experience in DevOps, cloud infrastructure, automation, and software development. The ideal candidate is Azure-certified and proficient in GitHub Actions, Ansible scripting, and Infrastructure as Code (IaC).
This role requires deep expertise in building scalable, resilient, and automated cloud environments, ensuring high system reliability, and streamlining CI/CD processes.
Key Responsibilities:
- Design, implement, and maintain highly available, scalable, and fault-tolerant systems.
- Architect, deploy, and optimize cloud infrastructure on Microsoft Azure, ensuring best practices for security, cost management, and performance.
- Build and enhance GitHub Actions workflows to support automated software delivery.
- Develop and maintain Ansible playbooks and Terraform configurations to manage cloud infrastructure and configuration.
- Design and implement automation solutions to reduce manual effort, improve incident response, and enhance system reliability.
- Proactively monitor, troubleshoot, and resolve complex production issues, ensuring minimal downtime.
- Implement and manage logging, monitoring, and alerting using tools like Prometheus, Grafana, Azure Monitor, or Datadog.
- Enforce security best practices across cloud and DevOps workflows, ensuring compliance with industry standards.
- Work closely with development, operations, and security teams while mentoring junior engineers on DevOps and SRE best practices.
Required Qualifications:
- 7+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering.
- Azure Certification (e.g., AZ-104, AZ-400) is mandatory.
- Strong experience with GitHub Actions for CI/CD automation.
- Expertise in Ansible scripting and Infrastructure as Code (IaC).
- Proficiency in Python, Go, or Bash for automation and development.
- Deep understanding of containerization (Docker, Kubernetes) and microservices architectures.
- Experience with monitoring and observability tools such as Prometheus, Grafana, or Azure Monitor.
- Strong problem-solving and troubleshooting skills with a focus on automation and scalability.
Preferred Qualifications:
- Experience with Terraform for cloud provisioning and configuration.
- Knowledge of API management and integrations.
- Experience with incident management and SLO/SLI definitions.
- Understanding of zero-downtime deployment strategies such as blue-green deployments.
-
Senior Site Reliability Engineer
1 week ago
Canada Regie Full time
Regie.ai is a Series B-funded, AI-native sales engagement automation platform focused on transforming business-critical prospecting (the top of the funnel) into a precise, scalable, and repeatable process. As the volume of sales activity required to book a meeting continues to grow exponentially, traditional tools have failed to keep pace, leaving critical...
-
Senior Site Reliability Engineer
1 week ago
Canada Regie Full time
Company Overview: Regie.ai is a Series B-funded, AI-native sales engagement automation platform focused on transforming business-critical prospecting (the top of the funnel) into a precise, scalable, and repeatable process. As the volume of sales activity required to book a meeting continues to grow exponentially, traditional tools have failed to keep...
-
Site Reliability Systems Engineer
1 week ago
Canada Regie Full time
About This Role: We're seeking an experienced Senior Site Reliability Engineer/DevOps who can design and maintain production-grade infrastructure with high availability and low latency. This role involves extensive hands-on experience with AWS and its core services. You'll be responsible for architecting a unified monitoring and alerting system for engineering...
-
Site Reliability Engineer – Security Focus
2 weeks ago
Canada Telna Full time
Site Reliability Engineer – Security Focus Location: Remote Department: Site Reliability Engineering Type: Full-Time Overview We're looking for a seasoned Site Reliability Engineer (SRE) with a strong background in infrastructure security to join our team. This role is not just about uptime; it's about building secure, robust, and auditable...
-
Site Reliability Engineer – Security Focus
2 weeks ago
Canada Telna Full time
Site Reliability Engineer – Security Focus Location: Remote Department: Site Reliability Engineering Type: Full-Time Overview We're looking for a seasoned Site Reliability Engineer (SRE) with a strong background in infrastructure security to join our team. This role is not just about uptime; it's about building secure, robust, and auditable systems across...
-
Site Reliability Leader
2 weeks ago
Canada iVedha Inc. Full time
iVedha Inc. is a leading provider of innovative solutions in cloud computing and DevOps. We are currently seeking an experienced Senior Site Reliability Engineer to join our team. About the Role: This is a unique opportunity to work with a talented team of engineers and contribute to the design and implementation of highly available, scalable, and...
-
Site reliability engineer
4 weeks ago
Canada Luxoft Full time
Project description: Do you like to work with existing and new software product development teams? This position is to instrument end-to-end observability and visibility for business-critical systems with log ingestion, metrics, and traces. You will function as a site reliability engineer (SRE) who will collaborate with product teams, infrastructure SMEs,...
-
Site Reliability Engineer
2 weeks ago
Canada Sigmaways Inc Full time
If you are passionate about reliability, automation, and performance optimization and working in a fast-paced, collaborative environment where innovation is encouraged, this role is for you. We are looking for a Site Reliability Engineer to optimize and maintain our production environment, ensuring a highly available and scalable platform for our...
-
Site Reliability Engineer
2 days ago
Canada Yochana Full time
Position Name – Site Reliability Engineer - Lead Type of hiring – Fulltime with TechM Location – Remote Canada Seek a hands-on Lead professional with strong technical expertise and leadership capabilities. Job Description: Experience Range: 5 to 10 Years Mandatory Skills: Linux, AWS or GCP, Kubernetes, Shell scripting. - Proven experience in Technical Project...
-
Site Reliability Engineer
5 days ago
Canada SmartSimple Software Full time
About SmartSimple & Foundant: At SmartSimple and Foundant Technologies, we empower mission-driven organizations to manage their data, workflows, and impact with our comprehensive software solutions. From grant management and community foundations to process automation and data collaboration, our combined expertise supports a diverse range of organizations -...
-
Site Reliability Engineer
6 hours ago
Canada Yochana Full time
Position Name – Site Reliability Engineer - Lead Type of hiring – Fulltime with TechM Location – Remote Canada Seek a hands-on Lead professional with strong technical expertise and leadership capabilities. Job Description: Experience Range: 5 to 10 Years Mandatory Skills: Linux, AWS or GCP, Kubernetes, Shell scripting. Proven experience in...
-
Senior Site Reliability Engineer/DevOps
2 weeks ago
Canada National Bank Full time
As a Specialist in site reliability engineering on the National Bank Data Protection team, you will ensure the operational reliability of data protection assets. With your experience and knowledge in the operational management of high-availability (HA) assets, you will have a positive impact on the Bank's stability and reputation with its internal and...
-
Canada MRL Consulting Group - the semiconductor recruitment company Full time
About the Role: MRL Consulting Group is partnering with a cutting-edge semiconductor company to find a highly skilled Senior Semiconductor Engineer for Reliability and Quality. As part of our client's core team, you'll play a crucial role in driving product robustness and mitigating risks. The ideal candidate will have a strong understanding of fab and...
-
Azure site reliability engineer
1 week ago
Canada Quantum World Technologies Inc. Full time
Job Description: The role is expected to perform day-to-day support for the business alongside reliability engineering tasks. The role has an emphasis on improving the reliability of our systems by working with the software developers and infrastructure engineering teams to develop automated reliability solutions. Using automation will help evolve our systems...
-
Reliability Engineer
3 weeks ago
Canada TechEra Global Inc Full time
Title: Reliability Engineer - Engine Programs & Accessories Location: Montreal, Canada (Onsite) Salary: Full Time Must-have key skills: Aero Engine Domain, LRU specialist, Project Management, Aero/Accessories Reliability Analysis. Max 2-5 years of experience in LRU Reliability Analysis, Project Management/stakeholder management. Keywords: Component Reliability or...
-
Reliable Cybersecurity Engineer
2 weeks ago
Canada Telna Full time
Telna's mission is to build a secure and reliable infrastructure that supports our growth and innovation. As a seasoned Site Reliability Engineer, you will play a pivotal role in achieving this goal. With a strong background in infrastructure security, you will be responsible for designing, implementing, and maintaining secure systems that meet the evolving...
-
Senior Site Reliability Engineer
1 week ago
Canada Regie Full time
About the Founders and Engineering Executive: Sridhar has a PhD from Carnegie Mellon and previously founded companies like Onera. He was among the first 100 engineers at Facebook and has also worked with Google and Bloomreach. He has worked with multiple startups and has helped them build their growth story from the ground up. John Zhang – Head of...
-
Reliability Engineering Manager
5 days ago
Canada SmartSimple Software Full time
Responsibilities: Ensure the high availability, reliability, and performance of our SaaS products. Develop and implement automation scripts to improve operational efficiency. Collaborate with cross-functional teams to design, deploy, and maintain cloud-based infrastructure and applications. Build and maintain monitoring and alerting systems to detect and resolve...