
Software Engineer: ML Optimization

Job

Seer

Santa Rosa, CA (In Person)

Full-Time

Posted 3 days ago (Updated 1 day ago) • Actively hiring

Expires 7/1/2026



Job Description

Software Engineer: ML Optimization at Seer in Santa Rosa, California
Posted about 16 hours ago.
Type: full-time
ML Systems Engineer - Training & Inference Optimization (MBMB)

We are building large-scale embodied intelligence systems designed to operate in complex real-world environments. Our work spans robot foundation models, high-performance training infrastructure, and on-device inference systems that run directly on robotic hardware.

We are seeking ML Systems Engineers to optimize both training and on-robot inference stacks. This role is focused on pushing performance boundaries across hardware, software, and model design - where improvements are still step-function rather than incremental. Internally, this team is known as MBMB (More Big More Better).

What You'll Do

Push Training and Inference Performance to the Limit
- Optimize both large-scale training systems and on-robot inference stacks
- Deliver meaningful, step-function improvements in throughput, latency, and efficiency
- Improve end-to-end system performance across distributed training and deployment environments

Make GPUs Perform at Maximum Efficiency
- Identify and remove bottlenecks across the full compute stack
- Optimize GPU utilization across training and inference workloads
- Improve performance of transformer and diffusion-based architectures under real-world constraints

Engineer Across the Full Stack
- Implement ML, hardware-aware, and software-level optimizations that materially improve system performance
- Work across:
  - CUDA kernels and low-level GPU execution
  - ML model architecture and compute efficiency
  - CPU bottlenecks and data pipelines
  - Network and distributed systems performance (NVLink, interconnects, and cluster communication)
  - Python, NumPy, and PyTorch-level inefficiencies

Drive System-Level Improvements
- Evaluate and implement changes that lead to measurable gains in training and inference efficiency
- Collaborate with ML researchers and systems engineers to identify high-leverage optimization opportunities
- Continuously profile, benchmark, and improve system performance across evolving workloads

What We're Looking For
- Strong experience with performance optimization in ML systems
- Up-to-date knowledge of modern training and inference techniques for transformer and diffusion models
- Ability to reason across the full stack, including:
  - GPU and CUDA-level optimization
  - Model architecture efficiency
  - CPU, memory, and I/O bottlenecks
  - Distributed networking and communication overhead
  - Framework-level performance (PyTorch, NumPy, Python)
- Strong systems intuition and ability to identify bottlenecks quickly
- Comfort operating in fast-moving environments where large performance gains are still available

Preferred Experience
- Experience optimizing large-scale training or inference systems
- Deep familiarity with GPU programming and kernel optimization
- Experience working with distributed ML systems at scale
- Exposure to model architecture-level efficiency improvements
- Background spanning both systems engineering and machine learning

Why This Role Matters
- Direct impact on both training speed and real-time robot performance
- Work on problems where improvements are still large and measurable
- Shape the efficiency and scalability of next-generation embodied intelligence systems
- Operate across the full stack - from hardware execution to model design

About the Company

We are a research-driven AI and robotics company focused on building scalable embodied intelligence systems. By combining advances in machine learning, systems engineering, and robotics, we aim to push the frontier of efficient, real-world AI.

We are committed to building an inclusive and diverse workplace and encourage applicants from all backgrounds to apply.