Head of AI/GPU Optimization Engineering
5 days ago
Kubex (formerly Densify) is building the future of autonomous, AI-driven infrastructure optimization. Our platform enables intelligent, policy-driven optimization across Kubernetes, cloud, and GPU-backed environments, improving performance, reducing cost, and eliminating waste for some of the world’s most sophisticated technology organizations. As AI workloads increasingly run on Kubernetes, especially for inference at scale, Kubex is expanding its GPU support to provide more advanced optimization and automation aligned with the unique challenges of GPU-accelerated infrastructure. We combine deep systems expertise, advanced analytics, and patented optimization technology to help customers run AI workloads efficiently and reliably in real-world production environments.

Role Overview
Kubex is seeking a Head of AI Optimization Engineering to lead the technical direction and hands-on development of our AI infrastructure optimization capabilities. This is a senior, hands-on technical leadership role reporting directly to the CTO.

You will act as a principal-level architect and engineer, owning the design and evolution of Kubex’s optimization solutions for Kubernetes-based environments running AI workloads, with a strong emphasis on GPU-accelerated inference. This role carries broad technical ownership and organizational influence, and we are looking for candidates interested in a position that provides both hands-on and people-leadership opportunities.

This role is ideal for someone who combines deep, practical experience with GPU infrastructure and Kubernetes with the ability to reason about system-level trade-offs, optimization strategies, and real-world customer environments, and who remains excited to write and ship production code.

Key Responsibilities

Technical Leadership & Architecture
Own the technical vision and architecture for Kubex’s AI infrastructure optimization capabilities, with a focus on Kubernetes-based environments running GPU-accelerated workloads.
Lead the design of systems that automate the optimization of resource configurations and allocations across containers, nodes, GPUs, and autoscaling groups.
Serve as a senior technical authority within the organization, guiding architectural decisions and influencing broader engineering strategy.
Contribute directly to production code, remaining deeply hands-on in the design, implementation, and evolution of core platform components.
Collaborate closely with other senior engineers to coordinate and execute complex software development initiatives.
Prototype, validate, and productionize new technical approaches related to AI workload optimization.

GPU & AI Infrastructure Expertise
Apply deep expertise in NVIDIA GPU ecosystems, including:
  CUDA and GPU programming models
  Tensor vs. non-tensor core trade-offs
  Multi-Instance GPU (MIG) configurations and advanced GPU sharing strategies
  Device plugins, telemetry, and instrumentation required to support optimization algorithms
Understand how customers deploy and operate AI workloads in production, from container configuration through node-level and cluster-level design.
Work with Kubernetes autoscaling technologies (e.g., native autoscaling, Karpenter, …) and understand their interaction with GPU-backed nodes.
Work with Kubex’s existing optimization frameworks and patented technologies, quickly building fluency and contributing to their evolution.
Collaborate with internal experts on optimization algorithms while bringing strong systems intuition and real-world constraints into solution design.
Identify opportunities to extend Kubex’s value beyond inference workloads, including potential future optimizations for training or hybrid workloads.

External & Cross-Functional Impact
Partner with Product Management to translate customer needs and market opportunities into actionable technical solutions.
Engage directly with customers on architecture and design discussions.
Represent Kubex externally through technical discussions, thought leadership, and industry engagement as appropriate.
Champion high standards for engineering quality, correctness, observability, and operational excellence.
Embrace and promote the use of AI-assisted development tools and workflows to accelerate software delivery and improve developer effectiveness.

Required Qualifications
10+ years of professional software engineering experience, including significant experience building complex production systems.
Deep, hands-on experience with GPU-accelerated infrastructure, particularly NVIDIA-based environments.
Strong knowledge of Kubernetes, including how GPU-backed workloads are scheduled, scaled, and operated in real-world clusters.
Practical experience with CUDA, GPU telemetry, and performance considerations for AI workloads.
Proven ability to design and build systems that balance performance, cost efficiency, and operational reliability.
Strong coding skills and a demonstrated commitment to remaining hands-on with production code.
Excellent communication skills, with the ability to explain complex technical concepts to both internal and external audiences.

Preferred Qualifications
Experience optimizing or operating large-scale AI inference platforms.
Familiarity with advanced GPU sharing strategies, including MIG, and their implications for scheduling and performance.
Exposure to optimization-based systems, scheduling, bin-packing, or resource allocation problems.
Experience working with autoscaling frameworks such as Kubernetes HPA/VPA or Karpenter.
Background in high-performance computing, large-scale distributed systems, or AI platforms at scale.
Experience mentoring or leading senior engineers, with interest in future people leadership.

Why Join Kubex?
Play a key role in shaping the future of AI infrastructure optimization.
Work on technically challenging problems at the intersection of Kubernetes, GPUs, and AI workloads.
Collaborate with a highly experienced, deeply technical team.
Influence product direction, architecture, and external technical positioning.
Flexible, remote-first culture focused on impact and innovation.
Competitive compensation, equity, and benefits.
-
Chief AI
5 days ago
Canada Kubex Full time
A technology company specializing in AI infrastructure is seeking a Head of AI Optimization Engineering. The ideal candidate will lead the development of AI optimization solutions within Kubernetes environments, focusing on GPU-accelerated workloads. Responsibilities include technical leadership, system design, and collaboration with senior engineers....
-
GPU Cloud Platform Engineer
3 weeks ago
Canada Yotta Labs Full time
Join to apply for the GPU Cloud Platform Engineer role at Yotta Labs. About Yotta Labs: Yotta Labs is pioneering the development of a Decentralized Operating System (DeOS) for AI workload orchestration at a planetary scale. Our mission is to democratize access to AI resources by aggregating geo-distributed GPUs, enabling high-performance computing for AI...
-
AI Compiler Engineer
6 days ago
U.S., Canada, Germany, Norway EnCharge AI Full time
EnCharge AI is a leader in advanced AI hardware and software systems for edge-to-cloud computing. EnCharge's robust and scalable next-generation in-memory computing technology provides orders-of-magnitude higher compute efficiency and density compared to today's best-in-class solutions. The high-performance architecture is coupled with seamless software...
-
AI Runtime Engineer
2 weeks ago
U.S., Canada, Germany, Norway EnCharge AI Full time
EnCharge AI is a leader in advanced AI hardware and software systems for edge-to-cloud computing. EnCharge's robust and scalable next-generation in-memory computing technology provides orders-of-magnitude higher compute efficiency and density compared to today's best-in-class solutions. The high-performance architecture is coupled with seamless software...
-
Software Engineer – GPU
1 week ago
- Street Northwest Edmonton, Alberta, TG C Canada Huawei Technologies Canada Co. Full time
Job description: Huawei Canada has an immediate 12-month contract opening for a Software Engineer. About the team: The Software-Hardware System Optimization Lab continuously improves the power efficiency and performance of smartphone products through software-hardware systems optimization and architecture innovation. We keep tracking the trends of...
-
Research Engineer
2 weeks ago
Canada Yotta Labs Full time
Research Engineer - Decentralized AI Systems. Join to apply for the Research Engineer - Decentralized AI Systems role at Yotta Labs. About Yotta Labs: Yotta Labs is pioneering the development of a Decentralized Operating System (DeOS) for AI workload orchestration at a planetary scale. Our mission is to democratize access to AI resources by aggregating...
-
Canada Yotta Labs Full time
A pioneering tech company is seeking a GPU Cloud Platform Engineer to join their team. This full-time role involves designing and operating extensive GPU infrastructure for AI workloads in cloud environments. The ideal candidate has a Bachelor's degree in Computer Science and several years of experience in Kubernetes management and cloud-native development....
-
Staff Software Engineer, GPU Infrastructure
3 weeks ago
Canada Cohere Full time
Who are we? Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. We obsess over what we...
-
Staff Software Engineer, GPU Infrastructure
6 days ago
Canada Cohere Full time
Who are we? Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. We obsess over what we...