Job Description
Program Lead: Product Operations - AI Observability
Uber
Sunnyvale, CA

Job Details
Full-time
$162,000 - $180,000 a year

Benefits
401(k)

Qualifications
Program management

Full Job Description

About the Role
The AI Observability Program Leader will own the end-to-end strategy, design, and implementation of the frameworks used to monitor, understand, and improve Uber's GenAI-powered agentic systems. This role sits within the Global Digital Experience team, the operational arm of Uber's customer support tech organization, and is a critical driver of accuracy, safety, and reliability across Uber's next-generation AI solutions.

This leader will bridge the gap between raw AI logs and actionable product insights. You will define the methodologies for agentic reasoning observability, develop automated evaluation (autoeval) systems, and design simulators to stress-test AI performance before it reaches the customer. You will partner closely with Product, Engineering, and Data Science to translate complex agent behaviors into micrometrics: the granular signals that help us pinpoint exactly where a reasoning chain succeeded or failed.

The ideal candidate brings a systems-thinking mindset, technical literacy in LLM orchestration, and the ability to influence technical roadmaps through rigorous data and observability frameworks.

What You'll Do
Architect Observability Frameworks:
Own the strategy for understanding AI agentic reasoning, enabling deep analysis of step-by-step agent decision-making.
Drive Autoeval Strategy:
Design and roll out automated evaluation systems (LLM-as-a-judge) to provide a scalable, high-confidence "pulse" on AI performance across conversational and voice interfaces.
Define Micrometrics:
Develop granular signals within agentic activity (identifying latent failures, reasoning loops, or tool-calling inefficiencies) to drive product improvements.
Lead Pre-Launch Simulation:
Partner with Product and Engineering to build and maintain simulation environments that test AI agents against edge cases before deployment, and democratize these tools for Operations teams.
Cross-Functional Technical Partnership:
Act as the primary liaison between Product, Engineering, and Data Science to ensure observability tooling is integrated into the development lifecycle and directly informs release "Go/No-Go" decisions.
Insight Synthesis:
Package complex technical observability data into clear, actionable narratives for leadership, highlighting specific failure patterns and opportunities for CX improvement.
Operational Excellence:
Establish the standards and tooling for how AI performance is reported globally, ensuring consistency across different regions and support modalities.

Basic Qualifications
5+ years of experience in Technical Program Management, Product Operations, AI Quality, or Observability.
Bachelor's degree in Engineering, Computer Science, Data Science, or a related technical field.

Preferred Qualifications
AI Literacy:
Deep understanding of GenAI systems, including LLM orchestration, agentic workflows, and the nuances of reasoning chains (e.g., Chain of Thought).
Systems Thinking:
Proven experience designing technical frameworks or evaluation pipelines (e.g., autoevals, RAG evaluation, or model benchmarking).
Analytical Rigor:
Ability to define and track complex technical metrics (micrometrics) and correlate them with high-level business KPIs.
Influence without Authority:
Demonstrated ability to drive complex initiatives in an individual contributor (IC) capacity by building strong partnerships with Engineering and Product teams.
Advanced AI Expertise:
Experience with "LLM-as-a-judge" frameworks, prompt engineering for evaluations, and fine-tuning feedback loops.
Simulation & Testing:
Background in building simulators, "digital twins," or robust A/B testing frameworks for conversational AI or autonomous agents.
Tooling Proficiency:
Familiarity with AI observability tools.
Problem Solving:
Exceptional ability to turn "noisy" AI logs into structured failure pattern analysis.
Communication:
Strong ability to translate highly technical agent behaviors into business-relevant insights for non-technical stakeholders.
Domain Knowledge:
Experience in Customer Support technology, Voice UX, or high-volume automated workflows.

For Sunnyvale, CA-based roles: The base salary range for this role is USD $162,000 to USD $180,000 per year.

You will be eligible to participate in Uber's bonus program, and may be offered an equity award and other types of compensation. All full-time employees are eligible to participate in a 401(k) plan. You will also be eligible for various benefits. More details can be found at the following link: https://jobs.uber.com/en/benefits