Perception Data Pipeline Engineer

Job

O2 Technologies, Inc

Redwood City, CA (In Person)

Full-Time

Posted 1 day ago (Updated 10 hours ago) • Actively hiring

Expires 6/28/2026

Job Description

About The Role

Software Engineer, Perception Attributes Autolabeling Pipeline
Onsite in Foster City, CA | at least 3 days in office

The Perception Attribute Flywheel team is looking for a Software Engineer to build and operate the autolabeling pipeline that accelerates human annotation throughput on vehicle attribute classification tasks.

Zoox is building a future for Riders, not drivers. The accuracy of our perception attribute models (recognizing emergency vehicles, school buses, brake lights, hazard signals, and more) depends on a steady flow of high-quality labeled examples drawn from our fleet's drive data. Today, every label is produced by a human annotator from scratch. We are building a pipeline that uses off-the-shelf foundation models (Gemini, SigLIP, CLIP) to pre-label tasks, so human reviewers verify and correct rather than labeling from scratch.

This role owns the pipeline engineering for that system: ingesting queued tasks from our annotator service, calling foundation-model APIs at fleet scale, writing structured predictions back into the labeling workflow, and operating the whole thing reliably. The team lead and supporting ML engineers own model selection, prompt design, and evaluation methodology; this role partners closely with them but is not expected to own those decisions.

If you take pride in building reliable, observable, well-tested data pipelines and want to ship a system that visibly accelerates an autonomous vehicle program, you will excel in this role.

Responsibilities

Build the autolabeling pipeline: ingest queued tasks from the annotator service, dispatch them to foundation-model APIs (Gemini and others), parse structured outputs, and write pre-labels back to the labeling workflow
Build the observability layer: per-task latency, per-model cost, per-attribute coverage, error-mode dashboards
Run experiments designed by the team lead: set up the inputs, execute, and collect outputs in formats the ML engineers can analyze
Integrate the pipeline cleanly with existing Zoox systems, partnering with the data infrastructure team
Document the system, write runbooks, and ensure a clean handoff at end of engagement

Qualifications

3+ years of backend / data pipeline engineering experience
Strong Python; comfort with C++
Large-dataset experience with PySpark or equivalent
ML fundamentals: understanding of model inference, embeddings, structured output, and common eval metrics (precision, recall, calibration); able to reason about ML data shapes and integration patterns
Experience integrating foundation models (Gemini, OpenAI, Anthropic) at production scale
Excellent written communication for design docs and runbooks

Bonus Qualities: Experience With Any of the Following

Databricks
End-to-end ML pipeline stewardship: owned an ML system in production from data ingest through inference through monitoring
Annotation tooling or human-in-the-loop ML workflows
Autonomous-systems data pipelines
AWS, especially S3, ECS/EKS, Lambda
Working in a codebase shared with ML engineers (proto schemas, joint deploys)

Key Responsibilities & Skills

Autolabeling Pipeline Development
Vehicle Attribute Classification Data Flow
Human-in-the-Loop Annotation Acceleration
ML Model Inference Integration
Observability & Monitoring of Data Pipelines
Experimentation Support for ML Teams
Documentation & Runbook Creation
Cross-Team Integration with Data Infrastructure

Technical Skills

Python
C++
PySpark / Spark
AWS (S3 / ECS / EKS / Lambda)
Databricks
Foundation Model APIs (Gemini / OpenAI / Anthropic)
REST API Integration
Docker / Containerization
Observability Dashboards

Education

Bachelor's Degree in Computer Science, Software Engineering, Electrical Engineering, or Computer Engineering.
Preferred: Master's in Computer Science, Master's in Artificial Intelligence, Master's in Machine Learning, or PhD in Computer Science.

Industry Experience

Autonomous Vehicles
Autonomous Driving
Automotive
Computer Vision
Machine Learning Operations (MLOps)
Data Engineering for AV