Site Reliability System Admin

7 months ago


Quebec City, Canada | Hewlett Packard Enterprise | Full time

Site Reliability System Admin / Administrateur du système de fiabilité du site

Do you have a passion for invention and self-challenge? Do you thrive on pushing the limits of what’s considered feasible? At Hewlett Packard Enterprise, you will have the power to make the most out of your career. Hewlett Packard Enterprise is one of the world’s largest and most successful IT companies. We are successful not just because of the technology solutions that we deliver, but also because of our core values and the amazing people that we have. We invest in our employees’ personal growth and development in an environment that will challenge and reward them. Hewlett Packard Enterprise is filled with energetic people, sparking technology revolutions and creating the future to help improve the lives of every customer.

HPE is seeking a System Administrator to design, test and administer systems in support of the Supercomputing as a Service (SCaaS) business. This is an exciting opportunity to have a significant impact on a key business with considerable growth potential. In this role, you will have a great deal of creative freedom to define and develop solutions that will support a scaling customer base.

**This role will be performed onsite at the data center in Quebec City, Canada.**

**There will be weekend/off-hours on-call rotation for this position.**

**Primary Responsibilities**
- Ensure continuous uptime of HPC systems at large scale
- Provide system administration for our groundbreaking Supercomputing-as-a-Service system
- Create scripts and infrastructure as code to automate the support of cloud infrastructures and HPC-as-a-Service clusters
- Apply technical thinking to break down complex data and to engineer new ideas and methods for solving, prototyping, designing, and implementing cloud-based solutions
- Help design and implement security aspects of the computing infrastructure
- Administer cloud-based HPC systems
- Collaborate with project managers and development partners to ensure effective and efficient delivery, deployment, operation, monitoring, and support of HPC engagements

**Experience and Skills**
- Experience in Linux systems administration, planning, and maintenance
- An understanding of high-speed networks
- An understanding of the security concerns in a cloud environment
- Hands-on experience with **Linux administration** at scale
- Good communication skills
- Hands-on experience with the tools and infrastructure to support **HPC systems** at scale, including networking and storage
- An understanding of **high-performance computing**
- Proficiency in the use and operation of **Linux-based environments**, including shells, system configuration, and administration
- Prior experience with large-scale clustered systems (preferably HPC experience with parallel compute systems)
- 5+ years of experience
- BS in Computer Science, IT Management, or equivalent

Join us and make your mark

**We offer**:

- A competitive salary and extensive social benefits
- Diverse and dynamic work environment
- Work-life balance and support for career development
- An amazing life inside the element

Want to know more about it? Then let’s stay connected.



