Current jobs related to Staff Engineer, HPC Infrastructure - Canada - Tenstorrent
-
GCP DevOps HPC Engineer
3 weeks ago
Canada ExaTech Inc Full time Remote (Canada) About the Role As a Senior DevOps-HPC Engineer at Xebia, you will join a dynamic Engineering team in a high-energy and collaborative environment. This role is ideal for a seasoned HPC engineer with deep expertise in SLURM, Linux, and cloud migrations, who thrives on leading complex projects,...
-
Staff Software Engineer, GPU Infrastructure
18 minutes ago
Canada Cohere Full time Who are we? Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. We obsess over what we...
-
Remote: Senior GCP HPC DevOps Engineer
3 weeks ago
Canada ExaTech Inc Full time A tech company is seeking a Senior DevOps-HPC Engineer to lead the migration of HPC clusters to Google Cloud. The ideal candidate will have over 5 years' experience with SLURM and Linux-based systems, strong cloud migration skills, and exceptional problem-solving abilities. This role involves optimizing HPC solutions, automating deployments, and...
-
Staff Infrastructure Software Engineer
43 minutes ago
Remote, Canada Addepar Full time The Role We are currently seeking a Staff Software Engineer to join the AI Platform team to drive the design, architecture, and production posture of Addepar's AI Platform and our products and solutions. This team is at the center of Addepar's mission to integrate AI across our product suite and is growing quickly. This role focuses on building a scalable...
-
Data Infrastructure Engineer
5 days ago
Canada Meshy Full time Headquartered in Silicon Valley, Meshy is the leading 3D generative AI company on a mission to Unleash 3D Creativity by transforming the content creation pipeline. Meshy makes it effortless for both professional artists and hobbyists to create unique 3D assets, turning text and images into stunning 3D models in just minutes. Meshy is trusted by top...
-
Data Infrastructure Engineer
3 weeks ago
Canada MeshyAI Full time About Meshy Headquartered in Silicon Valley, Meshy is the leading 3D generative AI company on a mission to Unleash 3D Creativity by transforming the content creation pipeline. Meshy makes it effortless for both professional...
-
Staff Infrastructure Software Engineer
5 minutes ago
Remote, Canada Addepar Full time Who We Are Addepar is a global technology and data company that helps investment professionals provide the most informed, precise guidance for their clients. Hundreds of thousands of users have entrusted Addepar to empower smarter investment decisions and better advice over the last decade. With client presence in more than 50 countries, Addepar's platform...
-
Senior Software Engineer
34 minutes ago
Remote - Toronto, Ontario, Canada / Remote - Alberta, Canada Terawatt Infrastructure Full time About Terawatt Infrastructure The once in a century transition to autonomous and electric vehicles is underway and will require a multi-trillion-dollar investment in energy and charging infrastructure, and the real estate to site it on. Terawatt is the leader in delivering large scale, turnkey charging solutions for companies rapidly deploying AV and EV...
-
Distinguished Engineer – Data Center Networking
45 minutes ago
Canada Nokia Global Full time Description Reporting directly to the CTO office, the Distinguished Engineer will lead the strategy and architecture of next-generation switches and networking solutions required for the artificial intelligence (AI) era. This role focuses on enabling high-performance, scalable, secure and reliable system solutions for both AI training and inferencing...
-
Research Engineer
9 hours ago
Aviva Way Markham, Ontario, LG B Canada Huawei Technologies Canada Co. Full time Job description Huawei Canada has an immediate permanent opening for a Research Engineer. About the team: The Distributed Data Storage and Management Lab leads research in distributed data systems, aiming to develop next-generation cloud serverless products that encompass core infrastructure and databases. This lab addresses various data challenges, including...
Staff Engineer, HPC Infrastructure
1 day ago
Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists have developed a high-performance RISC-V CPU from scratch, and share a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

We're seeking a Staff HPC Engineer who thrives on turning hundreds of bare-metal compute nodes into consistent, production-ready clusters through automation and infrastructure-as-code. You'll design and maintain OS deployment pipelines that provision nodes in minutes, use Ansible to eliminate configuration drift across global sites, and ensure RHEL/Ubuntu systems stay performant and reliable as our compute demands scale exponentially. In semiconductor design, where millions of EDA jobs run daily, your automation work directly translates to faster design cycles and higher cluster utilization.

This role is hybrid, based out of Austin, TX, Santa Clara, CA, or Toronto, CA.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.

Who You Are
- Deep experience with IBM Spectrum LSF or similar workload managers.
- Strong background in commercial HPC storage platforms such as Pure Storage FlashBlade, Weka, NetApp, etc.
- Hands-on experience with container technologies (Docker, Singularity, Podman).
- Solid Linux system administration skills.
- Understanding of HPC networking, storage architectures, and job scheduling.
- Ability to diagnose and resolve complex infrastructure issues independently.
- Comfortable working in a startup environment with rapidly changing requirements.

What We Need
- Design and maintain automated bare-metal provisioning pipelines that deploy hundreds of compute nodes globally with consistent configurations.
- Implement infrastructure-as-code practices using Ansible to manage large-scale OS configuration across diverse hardware platforms.
- Own the lifecycle management of RHEL and Ubuntu systems, from initial deployment through patching, upgrades, and performance tuning.
- Build automation and tooling to streamline provisioning, patching, and system updates as the compute environment scales.
- Troubleshoot OS-level issues, optimize kernel parameters, and resolve system performance bottlenecks that impact EDA workflows.
- Work directly with hardware design teams to standardize system configurations, toolchains, and development environments.
- Deploy and lifecycle manage systems across Tenstorrent's global engineering sites, ensuring consistency and reliability.

Nice to Have
- Experience supporting EDA tools and hardware design workflows in production HPC environments.
- Hands-on expertise with commercial HPC storage platforms (Pure Storage, Weka, NetApp) and workload managers (LSF, Slurm).
- Container technologies (Docker, Singularity, Podman) for reproducible compute environments at scale.
- Advanced provisioning techniques (PXE boot, kickstart, cloud-init) and modern infrastructure automation patterns.
- Cluster monitoring and observability tools (Prometheus, Grafana) for managing thousands of compute nodes.
- Security hardening and compliance frameworks for multi-tenant semiconductor design environments.
- Integration of open-source and commercial tools to improve provisioning efficiency and reliability.

Work in a deeply technical environment solving infrastructure challenges that directly impact chip design velocity.

Compensation for all engineers at Tenstorrent ranges from $100k – $500k including base and variable compensation targets. Experience, skills, education, background and location all impact the actual offer made. Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E1, and E2). These requirements apply to persons located in the U.S., and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: Computer Hardware Manufacturing