Site Reliability Expert
6 days ago
Job description
IT
Site Reliability Expert (SRE)
Quebec
Simons Campus - IT
Full time
Are you looking to join our Information Technology team in a unique role that contributes to the optimal maintenance of our production environment? Join the Simons family as a Site Reliability Engineer (SRE).
The person in this role plays a key part in ensuring the smooth operation of our production environment by adopting a proactive, software-engineering-oriented approach. Reporting to the Director of Solution Architecture and Software Engineering, the SRE is responsible for ensuring the continuous availability of large-scale distributed software applications while maintaining high levels of performance and reliability.
Key Responsibilities:
- Provide primary operational support for multiple large-scale distributed software applications.
- Collect and analyze metrics from operating systems and applications to support performance optimization and incident troubleshooting.
- Measure and optimize system performance.
- Deliver infrastructure services using Infrastructure as Code (IaC).
- Maintain services that use the Operator Framework.
- Maintain and enhance continuous integration and continuous deployment (CI/CD) tooling built on ArgoCD and GitHub Actions.
- Automate IT operations tasks using Ansible.
- Participate in system design consultations, platform management, and capacity planning.
- Balance feature development velocity and reliability with well-defined service-level objectives.
- Collaborate with development teams to improve services through rigorous testing procedures.
- Build sustainable systems and services through automation and continuous improvement.
- Develop software and systems to manage platform infrastructure and applications.
Desired Profile:
- Bachelor's degree in computer science, software engineering, IT engineering, electrical engineering, or any other relevant field.
- At least two (2) years of experience in a role related to DevOps, SRE, platform engineering, or software engineering.
- Experience with Kubernetes, preferably Red Hat OpenShift.
- Experience with full-stack observability platforms such as Datadog and New Relic.
- Practical coding knowledge beyond simple scripting.
- Strong understanding of cloud-native approaches.
- Advanced programming skills (structured and object-oriented) in one or more high-level languages such as Java, Python, C/C++, Go, or JavaScript.
- Proactive approach to identifying issues, performance bottlenecks, and areas for improvement.
- Strong teamwork abilities and communication skills to work effectively with diverse stakeholders in a constantly evolving environment.
- Ability to communicate effectively in both French and English, spoken and written; English is required to use systems and tools and to carry out various tasks.
Benefits Available:
- Telemedicine service and an Employee and Family Assistance Program.
- Group insurance plan and RRSP.
- Up to 40% off Simons purchases.
- Fitness area with changing rooms, group classes, and kinesiology services.
- Cafeteria service offering an extensive and affordable menu.