Principal Engineer
8 hours ago
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
The Principal Engineer - DC GPU AI/ML Advanced Forward Deployment and Systems Engineering is a leadership position designed to optimize the design, roll-out, and post-rollout management of AI/ML fabrics. The candidate will be the technical interface between customers and various internal engineering groups and field application engineers. Leveraging extensive experience in large network architecture, storage, AI/ML network deployments, and performance tuning, this role requires a disciplined approach to system triage, at-scale debug, and infrastructure optimization to ensure robust performance and efficient transitions from GPU production qualification to at-scale datacenter deployment.
THE PERSON:
This position is for a Principal Engineer - DC GPU AI/ML Advanced Forward Deployment and Systems Engineering with a focus on architecture and design; optimizing compute, network, and storage; and benchmarking machine learning applications. You will be part of a team that works closely with strategic customers and partners to enable large-scale deployment of AMD CPU and GPU platforms. You will closely interface with ROCm software developers, DC GPU HW/FW/ASIC teams, Field Engineering teams, OEM/ODM partners, CSPs, and Marketing/Business Development teams. Must be self-motivated and possess the ability to work well within a team environment.
KEY RESPONSIBILITIES:
- Collaborate with strategic customers on scalable designs spanning compute, networking, and storage environments, and work with industry partners and internal teams to accelerate the deployment and adoption of various AI/ML models.
- Engage in system-level triage and at-scale debug of complex issues across hardware, firmware, and software, ensuring rapid resolution and system reliability.
- Drive the ramp of Instinct-based large scale AI datacenter infrastructure based on NPI base platform hardware with ROCm, scaling up to pod and cluster level, leveraging the best in network architecture for AI/ML workloads.
- Enhance tools and methodologies for large-scale deployments to meet customer uptime goals and exceed performance expectations.
- Engage with clients to deeply understand their technical needs, ensuring their satisfaction with tailored solutions that leverage your past experience in strategic customer engagements and architectural wins.
- Provide domain-specific knowledge to other groups at AMD and share lessons learned to drive continuous improvement.
- Engage with AMD product groups to drive resolution of application and customer issues.
- Develop and present training materials to internal audiences, at customer venues, and at industry conferences.
PREFERRED EXPERIENCE:
- Expertise in networking and performance optimization for large-scale AI/ML networks, including network, compute, storage cluster design, modelling, analytics, performance tuning, convergence, scalability improvements.
- Prefer candidates with solid, hands-on expertise in at least one of three domains: compute, network, or storage.
- Demonstrated leadership in network architecture and hands-on experience in RoCEv2 design, VXLAN-EVPN, BGP, and lossless fabrics.
- Deep experience in working with large customers such as Cloud Service Providers and global enterprise customers
- Proven leadership in engaging customers with diverse technical disciplines in avenues such as Proof of Concept, Competitive evaluations, Early Field Trials etc.
- Direct experience working with large customers, with the ability to operate with a sense of urgency, own problems, and resolve them.
- Extensive experience in Python, Linux, kernel modules, and application libraries, unless accompanied by other skill sets in the space.
- Proven ability to influence design and technology roadmaps, leveraging a deep understanding of datacenter products and market trends.
- Extensive hands-on network deployment expertise and a proven track record of delivering large projects on time. Cisco, Juniper, or Arista experience is required.
- Direct, co-development/deployment experience in working with strategic customers/partners in bringing solutions to market.
- Excellent communication skills with audiences ranging from engineers to mid-level management to C-level executives.
- This is a senior-level role; no recent college graduates will be considered.
ACADEMIC CREDENTIALS:
- Bachelor's or master's degree in computer science, engineering, or related subjects.
- Ability to work well in a geographically dispersed team.
- Certifications in Networking, AI/ML, or Cloud Technologies.
Benefits offered are described: AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.