Senior Solutions Architect, Cloud Infrastructure and DevOps

1 month ago


Old Toronto, Canada NVIDIA Full time

NVIDIA is the world leader in computer graphics, artificial intelligence, and accelerated computing. For over 25 years, we have been at the forefront of research and engineering around the greatest advances in technology. Our history of innovation drives us to solve the world's hardest problems.

NVIDIA is looking for a Senior Cloud Infrastructure/DevOps Solutions Architect to join its NVIDIA Infrastructure Specialist Team. Academic and commercial groups around the world are using NVIDIA products to revolutionize deep learning and data analytics, and to power data centers. Join the team building many of the largest and fastest AI/HPC systems in the world. We are looking for someone who can work on a dynamic, customer-focused team and who has excellent interpersonal skills. In this role, you will interact with customers, partners, and internal teams to analyze, define, and implement large-scale networking projects. The scope of these efforts includes a combination of networking, system design, and automation, while acting as the face to the customer.

What you'll be doing:

  • Design, implement, and maintain large-scale HPC/AI clusters with monitoring, logging, and alerting. Manage Linux job/workload schedulers and orchestration tools.
  • Develop and maintain continuous integration and delivery pipelines.
  • Develop tooling to automate deployment and management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources.
  • Deploy monitoring solutions for the servers, network, and storage.
  • Troubleshoot from the bottom up: bare metal, operating system, software stack, and application level.
  • As a technical resource, develop, redefine, and document standard methodologies to share with internal teams. Support Research & Development activities and engage in POCs/POVs for future improvements.
  • Worldwide travel is required for on-site visits with customers.

What we need to see:

  • BS/MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields, with at least 8 years of work or research experience in networking fundamentals, the TCP/IP stack, and data center architecture.
  • Knowledge of HPC and AI solution technologies from CPUs and GPUs to high-speed interconnects and supporting software.
  • Direct design, implementation, and management experience with cloud computing platforms (e.g. AWS, Azure, Google Cloud).
  • Experience with job scheduling and workload orchestration technologies such as Slurm, Kubernetes, and Singularity.
  • Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu) networking (sockets, firewalld, iptables, Wireshark, etc.) and internals, ACLs and OS-level security protection, and common protocols (e.g., TCP, DHCP, DNS).
  • Experience with multiple storage solutions such as Lustre, GPFS, ZFS, and XFS. Familiarity with newer and emerging storage technologies.
  • Python programming and bash scripting experience.
  • Comfortable with automation and configuration management tools including Jenkins, Ansible, Puppet/Chef, etc.
  • Deep knowledge of networking protocols such as InfiniBand and Ethernet. Deep understanding of and experience with virtual systems (for example, VMware, Hyper-V, KVM, or Citrix).
  • Strong written, verbal, and listening skills in English are critical.

Ways to stand out from the crowd:

  • Knowledge of CPU and/or GPU architecture.
  • Knowledge of Kubernetes and container-related microservice technologies.
  • Experience with GPU-focused hardware/software (DGX, CUDA).
  • Background with RDMA (InfiniBand or RoCE) fabrics.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking individuals in the world working for us. If you're creative and autonomous, we want to hear from you.

The base salary range is 127,500 CAD - 279,500 CAD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

