Data Center GPU Commissioning Engineer
Job
Compugra Systems
San Jose, CA (In Person)
Full-Time
Job Description
Role: Data Center GPU Commissioning Engineer
Location: San Jose, CA (100% Onsite), 12 months

The Data Center GPU Commissioning Engineer is responsible for commissioning, validating, and stabilizing GPU-based infrastructure in data center environments. This role ensures GPU servers, interconnects, drivers, firmware, and platform software are correctly installed, configured, tested, and production-ready to support AI, ML, and HPC workloads. The engineer works closely with Deployment, Network, Platform, and Operations teams to deliver reliable, high-performance GPU clusters and ensure smooth handover to run operations.

Key Responsibilities
- Perform end-to-end commissioning of GPU servers and clusters in data centers.
- Validate hardware installation, power, cooling, and cabling readiness for GPU systems.
- Install and configure GPU drivers, firmware, BIOS settings, and system software.
- Verify GPU health, performance, and stability using standard validation and burn-in tests.
- Validate high-speed interconnects and networking used for GPU workloads.
- Execute cluster-level testing for AI/HPC readiness and baseline performance.
- Identify, troubleshoot, and resolve hardware, driver, or configuration issues during commissioning.
- Work with OEMs and vendors for issue resolution and firmware recommendations.
- Ensure systems comply with security, hardening, and operational standards.
- Document commissioning procedures, results, and as-built configurations.
- Support handover to operations teams and assist during early-life stabilization.

Required Skills & Experience

Technical Skills
- Hands-on experience with GPU-based servers in data center environments
- Strong understanding of:
  - Linux system administration
  - GPU drivers, firmware, and system tuning
  - Server BIOS, firmware upgrades, and hardware diagnostics
- Familiarity with data center networking concepts and high-performance interconnects
- Exposure to AI/ML/HPC environments is strongly preferred

Operational Skills
- Strong troubleshooting and root cause analysis skills
- Experience working in structured deployment and commissioning processes
- Ability to follow and improve runbooks and SOPs

Certifications (Preferred)
- OEM server certifications (HPE / Dell / Lenovo or equivalent)
- Linux administration certifications
- GPU/AI platform certifications (nice to have)