Software Engineer (AI Infrastructure / Training / Inference)

Job

SpreeAI

San Francisco, CA (In Person)

Full-Time

Posted 2 days ago (Updated 15 hours ago) • Actively hiring

Expires 7/24/2026

Job Description

Qualifications

Containerization systems
Data structures
Cloud infrastructure implementation
Algorithm design
Distributed computing

Full Job Description

About the Role

We are hiring Software Engineers focused on AI Infrastructure to build the systems that enable frontier multimodal AI to operate reliably at production scale. This role exists because modern generative and vision models require infrastructure beyond traditional backend engineering, including GPU orchestration, large-scale inference systems, performance optimization, and developer platforms that allow applied scientists to move fast without sacrificing reliability or cost efficiency.

You will work on:

Scalable model serving and inference pipelines.
Distributed GPU infrastructure.
Performance and cost optimization.
Reliability, observability, and production readiness.

You will operate at the boundary between systems engineering and machine learning, building the "paved roads" that allow advanced AI systems to scale safely and efficiently.

What you'll do

Design and build scalable infrastructure supporting training and inference workflows.
Develop high-performance APIs and backend services for AI model serving.
Optimize GPU utilization, latency, and throughput for multimodal workloads.
Build distributed systems supporting large-scale generative models.
Improve observability, monitoring, and reliability of AI systems.
Partner closely with Applied Science teams to productionize research systems.
Drive improvements in deployment workflows, automation, and platform usability.

Qualifications

Degree in Computer Science, Engineering, or comparable combination of education and practical experience.
Strong object-oriented programming skills (Python, C++, Java, Go, or similar).
Strong data structures and algorithms foundations.
Experience building production backend or distributed systems.
Understanding of cloud infrastructure concepts and containerized systems.

Preferred Qualifications

Experience with Kubernetes, Docker, or container orchestration.
Familiarity with GPU-based ML workloads or distributed training/inference systems.
Experience with model serving frameworks (vLLM, Triton, Ray Serve, or similar).
Experience with observability tools and performance debugging.
Familiarity with PyTorch or ML workflows.
Interest in optimizing systems for efficiency, scalability, and developer velocity.

SPREEAI is a fast-growing, innovative AI company at the forefront of fashion and e-commerce, revolutionizing how consumers engage with fashion through lifelike photorealistic try-on technology and hyper-personalized shopping experiences.
Our mission is to redefine the retail landscape with cutting-edge AI solutions that blend high fashion and technology. We thrive in a dynamic, fast-paced environment where creativity meets technology to drive real impact. If you are passionate about innovation and shaping the future of fashion, SPREEAI offers a platform to make your mark.