Site Reliability Engineer

12 hours ago


Montreal Quebec GF, CA Experience AI Solutions Full time

Senior Systems Administrator


Start Date: as soon as possible

Type of employment: permanent

Location: Montreal, QC (hybrid model for working in the office)

Number of Positions: 1

Language skills: Excellent English language skills

Perks: Work for a multinational, award-winning, socially responsible company that has been in business for over 75 years and has an operational presence in many countries. It is a culturally diverse environment employing thousands of people around the world, with beautiful downtown Montreal offices, bonuses, flexible benefits, a pension plan, and access to world-class learning.


As a Senior Systems Administrator, you will solve compelling technical challenges by analyzing, troubleshooting, and architecting vital services, platforms, and infrastructure, always with reliability, scalability, resilience, security, and performance in mind. To do so, you will need to understand the end-to-end configuration, technical dependencies, and general behavioral characteristics of the production services you support. You will also help maintain 24x7 availability of mission-critical, customer-facing production cloud services distributed across multiple regions. You will help create more consistent, automated push-button environments at all levels; proactively test and tune all aspects of the infrastructure; streamline CI/CD processes; monitor and respond to system notifications and alerts; and continuously work to improve the performance, security, and reliability of our systems.


Principal Duties and Responsibilities:

• Contribute to creating a culture of Site Reliability Engineering across the organisation by sharing best practices, approaches, documentation, and code with other engineering teams.

• Apply automation and software to tasks or parts of the system that would benefit from it, especially those currently performed manually.

• Troubleshoot complex, cross-platform issues while managing operating systems in cloud-based SaaS and on-premises environments; handle live production incidents, debugging, and infrastructure issues, following and applying best practices.

• Conduct system analysis and configuration management, and develop enhancements to the performance, availability, and reliability of system software.

• Design, write, ship, and drive the creation of software and systems to increase observability, product reliability, and organizational efficiency.

• Work closely with software engineers and testers to ensure that the system correctly addresses non-functional requirements such as performance, security, and availability.

• Document system knowledge as it is acquired over time, create run books, and ensure that critical system information is readily available to those who need it.

• Maintain and oversee deployment, orchestration, servers, and the overall backend infrastructure.


Education: B.Tech./B.E. degree in Electronics & Telecommunications or Computer Science.


Required Skills:

• Hands-on experience managing Windows Server 2012, 2016, and 2019, including Active Directory and Group Policy design and configuration.

• Significant experience with cloud computing infrastructure and the Microsoft Azure platform.

• Ability to provide advice, best practices, and recommendations for the operation and deployment of Microsoft Azure.

• Extensive experience supporting and managing hypervisor-based products and infrastructure (VMware, Hyper-V).

• Previous experience administering Linux systems (e.g., CentOS, Red Hat) and working with command-line tools such as Bash, Vim, and SSH.

• Expertise in infrastructure performance monitoring and analysis using standard performance monitoring tools (Nagios, Azure monitoring).

• Strong knowledge of Internet protocols and applications such as SMTP, DNS, HTTP, SSH, SNMP, etc.

• Hands-on experience in server farm configuration management (using tools such as Ansible, Terraform, etc.).

• Demonstrated knowledge of ITIL methodologies; ITIL v3 or v4 certification.





