Principal Technical Program Manager - Applied Science
Job
Microsoft
Redmond, WA (In Person)
$207,350 Salary, Full-Time
Job Description
Principal Technical Program Manager
The gap between benchmark scores and real developer experience is growing, making it hard to understand which problems are truly 'solved', and which are worth deeper investment. GitHub is uniquely positioned to lead the industry through this transition. We have direct feedback and deep insight into real production workflows from millions of developers, and the scale to build evaluation systems that truly reflect developer success. We're looking for a Principal Technical Program Manager to help us build the future of AI evaluation. The Applied Science team for GitHub Copilot sits at the intersection of frontier AI research and the world's largest developer platform. We ship AI-powered experiences (for example, code completion, code review, and coding agents) used by millions of professional developers every day.
As a member of the team, you will help lead GitHub Copilot's AI evaluation strategy end-to-end: from benchmark design and lifecycle governance, through evaluation infrastructure and internal adoption, to community engagement and public transparency. You are the person who ensures that every model swap, product harness, and feature launch is measured against what actually matters to developers, and that the world can see the results.
Responsibilities
In this role you'll:
- Partner with Applied Science researchers to translate cutting-edge evaluation research into production systems: adaptive testing (IRT), agent-centric co-evolution, adversarial benchmarking, and telemetry-driven benchmark generation.
- Lead the deprecation of saturated benchmarks and design their next-generation replacements, including procedurally generated code evaluations that can't be memorized and adaptive testing systems that skip trivial questions for frontier models.
- Build GitHub's community benchmark submission program, enabling external researchers, enterprises, and open-source developers to contribute domain-specific evaluations, and publish GitHub's first external benchmark transparency reports showing how models perform on real developer workflows.
- Design and operationalize multi-tier evaluation frameworks (from fast automated regression suites and LLM-as-judge systems, through expert human evaluation, to production A/B testing) so teams can iterate in hours, not weeks.
- Design feedback-to-benchmark pipelines that convert thumbs-down signals, user frustrations, and support tickets into candidate regression tests, systematizing informal practices into scalable, automated systems.
- Establish evaluation as a first-class discipline across GitHub Copilot, creating the rituals, dashboards, and communication cadences that make evaluation results accessible and actionable for every team.
Microsoft Cloud Background Check:
This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years.
Preferred Qualifications
- 5+ years of experience in technical program management, product management, applied science, or equivalent.
- 2+ years managing programs in machine learning, AI/ML evaluation, or data science.
- 2+ years managing cross-functional and/or cross-team projects.
- Deep, firsthand experience with AI/ML evaluation methodologies: benchmark design and validity, human evaluation frameworks, automated scoring systems (including LLM-as-judge), A/B testing, and statistical significance.
- Deep personal experience with AI coding tools: you use Copilot, Cursor, Claude Code, or similar tools daily and have strong opinions about what "good" looks like from a developer's perspective.
- Understanding of software engineering workflows at scale (code review, CI/CD, testing, debugging, refactoring) and how AI tools should integrate into each.
- Experience with community or open-source program management: contributor programs, external research partnerships, or developer relations in a technical context.
- Proven ability to navigate competing priorities across teams and build shared commitment to common goals in ambiguous, fast-moving environments.
- Track record of building evaluation systems that directly influenced product or model shipping decisions at scale.
Technical Program Management IC5: The typical base pay range for this role across the U.S. is USD $139,900 to $274,800 per year. A different range applies to specific work locations within the San Francisco Bay area and New York City metropolitan area, where the base pay range for this role is USD $188,000 to $304,200 per year.