Site Reliability Engineer (On Prem)
Insight Global
Santa Clara, CA (In Person)
$140,400 Salary, Full-Time
Job Description
Insight Global is looking for a Site Reliability Engineer (on prem) to join one of our largest clients in the Bay Area.
This person will:
- Provide day-to-day operational support for production and pre-production environments
- Administer and support Jenkins (user management, pipeline reliability, upgrades, troubleshooting)
- Manage and operate Kubernetes clusters at scale (deployments, scaling, upgrades, monitoring)
- Maintain system reliability, uptime, and performance through proactive monitoring and incident response
- Participate in on-call rotations to support critical systems and resolve production issues
- Work closely with engineering teams to support CI/CD and platform reliability
- Support on-site operations 3-4 days per week (collaboration with infra and hardware teams)
This role can pay between $60-75/hour depending on years of experience and skill set.
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/
Skills and Requirements
- 5+ years of experience in SRE, DevOps, or Systems Engineering roles
- Strong hands-on experience with Kubernetes in production
- Strong hands-on experience with Jenkins administration and CI/CD operations
- Experience supporting Linux-based systems in high-availability environments
- Comfort operating and troubleshooting complex infrastructure under SLA pressure
- Experience supporting GPU-based infrastructure (NVIDIA GPUs or similar)
- Hands-on exposure to GPU scheduling, health monitoring, and workload reliability
- Experience supporting ML/AI, compute-intensive, or accelerator-based workloads is a strong plus
- Familiarity with GPU drivers, firmware, and their integration within Kubernetes environments is preferred