
AI Inference Infrastructure Software Engineer (Kubernetes / Cloud)


Prime Team Partners

Medina, WA (In Person)

$190,000 Salary, Full-Time

Posted 4 days ago (Updated 16 hours ago) • Actively hiring

Expires 6/9/2026


Job Description

AI Inference Infrastructure Software Engineer (Kubernetes / Cloud) at Prime Team Partners in Medina, Washington. Posted about 23 hours ago.
Type: Full-time
Job Description:
AI Inference Infrastructure Software Engineer (Kubernetes / Cloud)
Location: Seattle, WA (Hybrid - 3 days/week onsite)
Compensation:
Targeting $170K - $210K, plus meaningful startup equity.

A fast-moving AI engineering team is looking for an Inference Infrastructure Software Engineer to build and operate the Kubernetes and cloud backbone behind large-scale accelerated inference workloads. If you thrive at the intersection of distributed systems, cloud infrastructure, and high-performance AI, this role puts you right at the core of next-generation inference platforms. If you want to help push AI inference to its performance limits and build the infrastructure that makes it possible, we'd love to connect.

What You'll Do
- Build and operate Kubernetes infrastructure powering large-scale inference services
- Run accelerated workloads with strict latency, throughput, and reliability requirements
- Manage AWS, GCP, and on-prem environments across networking, storage, IAM, and observability
- Develop automation and tooling in Python, Bash, and Go to streamline deployments and scaling
- Partner with ML, runtime, and hardware teams to productionize new inference capabilities
- Contribute to capacity planning, cost optimization, and reliability engineering
- Participate in on-call rotation for critical services

What You Bring
- 3-5 years of hands-on Kubernetes experience (EKS, GKE, or self-hosted)
- 2-3 years operating production workloads on AWS or GCP
- Experience running ML or accelerated inference services at scale
- Strong skills in Python, Bash, and Go
- Deep understanding of GPU/accelerator scheduling, device plugins, and cluster performance
- Experience with IaC (Terraform/Pulumi), config management (Ansible/Puppet/Salt), and GitOps (Argo/Flux)
- Comfortable operating in fast-moving, early-stage environments

Bonus Points
- Experience with inference servers (Triton, vLLM, TGI)
- Exposure to non-GPU accelerators (FPGAs, ASICs)
- Background in SRE, observability, or performance engineering
- Experience building customer-facing API platforms

Prime Team Partners is an equal opportunity employer. Prime Team Partners does not discriminate on the basis of race, color, religion, national origin, pregnancy status, gender, age, marital status, disability, medical condition, sexual orientation, or any other characteristic protected by applicable state or federal civil rights laws.

For contract positions, hired candidates will be employed by Prime Team for the duration of the contract period and are eligible for our company benefits. Benefits include medical, dental, and vision coverage; employees are covered at 75%. We offer a 401(k) after 6 months. We do not provide paid holidays or PTO; sick time is offered in accordance with local laws.

This position is open until filled.
