Job Description
Translucent Services, LLC is seeking a highly experienced and motivated Senior High-Performance Computing (HPC) System Administrator to join our team in Stennis AFB, Mississippi. The role covers the maintenance and optimization of our large-scale, cutting-edge HPC cluster environments. This is a critical, senior-level role that requires deep expertise in Linux operating systems and cluster management technologies, and a proven ability to maintain systems to high-availability and security standards for mission-critical operations.
An active TS/SCI U.S. security clearance is required.
Key Responsibilities
System Management & Maintenance:
Lead the installation, configuration, testing, and maintenance of Linux-based HPC clusters (e.g., using CentOS, RHEL, or similar distributions). Administer and troubleshoot cluster scheduling and resource management systems (e.g., SLURM, PBS Pro, or LSF). Manage and maintain high-speed interconnects (e.g., InfiniBand, Omni-Path) and associated network infrastructure.
Storage & File Systems:
Configure, maintain, and optimize parallel and networked file systems (Lustre, GPFS/Spectrum Scale, or NFS). Monitor and manage large-scale storage hardware to ensure data integrity and performance.
Performance Optimization:
Proactively monitor system performance and capacity, tuning and optimizing the HPC environment for maximum efficiency. Work with end-users and developers to diagnose and resolve complex performance bottlenecks in scientific and computational workflows.
Security & Compliance:
Ensure the HPC infrastructure adheres to all security policies and governmental compliance requirements, drawing on experience in TS/SCI-cleared environments. Manage user accounts, access controls, and security patches across the entire system.
Troubleshooting & Documentation:
Provide expert-level, Tier 3 support for all HPC-related issues. Develop and maintain comprehensive documentation for all cluster configurations, procedures, and service records.
Mentorship & Leadership:
Act as a technical leader, mentoring junior staff and guiding the adoption of best practices in HPC system administration.
Required Qualifications
Clearance:
Must possess an active TS/SCI U.S. security clearance.
Experience:
Minimum of 8 years of progressive experience in Linux system administration, with at least 5 years focused specifically on High-Performance Computing (HPC) clusters.
Technical Expertise:
Expert proficiency in Linux system administration and scripting (e.g., Bash, Python). Demonstrable expertise with HPC job schedulers (SLURM, PBS, or LSF). Deep understanding and practical experience with high-performance parallel file systems (Lustre, GPFS/Spectrum Scale). Strong networking knowledge, particularly high-speed fabrics like InfiniBand.
Soft Skills:
Excellent written and verbal communication skills, with a proven ability to work collaboratively in a team environment.
Preferred Qualifications (Nice to Have)
Experience with containerization and orchestration technologies (Docker, Singularity/Apptainer, Kubernetes). Familiarity with configuration management tools (Ansible, Puppet, SaltStack). Experience supporting scientific applications, compilers (GCC, Intel), and MPI libraries (Open MPI, MPICH). Relevant certifications (e.g., Red Hat Certified System Administrator/Engineer - RHCSA/RHCE).
Job Type:
Full-time
Pay:
$85,000.00 - $110,000.00 per year
Benefits:
401(k), dental insurance, health insurance, life insurance, paid time off, vision insurance
Application Question(s):
Do you have an active TS/SCI clearance?