Lead Sustaining Debug Engineer

3 weeks ago


Markham, Canada AMD Full time

Job Description

AMD is looking for a lead Sustaining Debug engineer to provide thought leadership, triage engagement, and subject matter expertise to our growing team. As a key contributor, you will have a strong technical background to contribute to all aspects of the sustaining debug process.

The Datacenter Graphics and Accelerated Computing (DCGPU) organization is looking for an experienced system L3 sustaining debug engineer. The individual will be part of a global team that provides technical and functional debug skills supporting client environments, including electrical, power, networking and SOC.

Responsibilities

  • Act as debug / triage engineer, with an understanding of industry tools for root-causing complex issues
  • Understand GPU/system-level HW and SW flow
  • Probe parts of a board, check electrical and power currents, and validate a system
  • Provide leadership in driving issues to root cause
  • Communicate and document flows and methods of bring-up, boot-up, system initialization and debug
  • Lead technical presentations demonstrating a good understanding of application, data, infrastructure and architecture expertise and application systems design
  • Collaborate with application and infrastructure architects, and be responsible for defining, designing and delivering the technical architectures, patterns, technical quality, risks, fitness for purpose and operability of technical architecture solutions
  • Be a leader and mentor to the operation team; be hands-on and lead by example
  • Troubleshoot and solve technical issues hands-on; own the problem and drive for resolution
  • Proactively support a team culture that fosters knowledge sharing, excellence, and collaboration

Requirements

  • Highly motivated hands-on leader with a strong development background, problem-solving mentality, excellent communication skills, and the ability to prioritize tasks, along with a willingness to learn and adapt
  • Experience in debugging complex HW/FW issues is a must; understand the flow of a GPU through the different layers of a system and be able to validate the items connecting to the GPU SOC
  • Communication is essential in working with different owners of the functional code stack, as is the ability to drive issues via phone calls, chat messages and e-mails
  • Hands-on experience with hardware in a datacenter environment is required

Preferred Experience

  • Significant experience in SoC and/or system debug of complex issues
  • Develop and document debug capabilities on a given SOC and system
  • Go-to person for debugging issues in production-level platform validation
  • Collaborate with internal teams on root-causing issues and finding optimum resolutions
  • Hands-on experience using industry debug tools and scopes, as well as examining board-level power
  • Proven experience with C/C++
  • Demonstrable experience in facilitating Agile, Scrum or Kanban
  • Skilled in scripting languages such as Perl, Ruby, and shell script
  • Proficient with revision control (Git, SVN and CVS)
  • Experience crafting and supporting cloud environments, including IaaS and PaaS
  • Database development: PostgreSQL, Oracle, MS SQL Server
  • Good balance of hardware, architecture, and software expertise
  • Proven ability to drive resolution of critical problems within a lab or datacenter
  • Relationships with external customers/partners, and the ability to help resolve problems in their data centers
  • Relationships with external customers/partners, and the ability to work on manufacturing issues/failures
  • Relationships with external customers/partners, and the ability to define requirements for manufacturing validation

Academic Credentials

  • Bachelor’s/Master’s degree in Computer Science or a related field strongly preferred, plus a minimum of 8 years of experience in system- or SOC-level debug and triage

AMD is an equal opportunity, inclusive employer and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.



  • Markham, ON, Canada AMD Full time

    WHAT YOU DO AT AMD CHANGES EVERYTHING We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our...


  • Markham, Canada AMD Full time

    Job Description: AMD is looking for a lead Sustaining Debug engineer to provide thought leadership, triage engagement, and subject matter expertise to our growing team. As a key contributor, you will have a strong technical background to contribute to all aspects of the sustaining debug process. The Datacenter Graphics and Accelerated Computing (DCGPU)...


  • Markham, Ontario, Canada Advanced Micro Devices, Inc Full time $120,000 - $200,000 per year

    WHAT YOU DO AT AMD CHANGES EVERYTHING At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create...


  • Markham, Ontario, Canada AMD Full time US$120,000 - US$180,000 per year

    WHAT YOU DO AT AMD CHANGES EVERYTHING At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create...


  • Markham, Canada Advanced Micro Devices Full time

    A leading technology company located in York Region, Markham, is seeking a GPU Platform Emulation Engineer. The role involves validating and debugging essential firmware and software on emulation models used in high-performance computing and machine learning products. Candidates should have solid experience in emulation platforms, strong debugging skills,...


  • Markham, Canada AMD Full time

    A leading semiconductor company is seeking a Linux Platform Software Emulation Engineer in Markham, Ontario. This full-time role involves working on critical firmware and software validation, leading to innovation in AI and computing technologies. The ideal candidate should have a solid background in software engineering, debugging, and the ability to...