AI Researcher - Reinforcement Learning

Job

1X

San Carlos, CA (In Person)

$250,000 Salary, Full-Time

Posted 1 week ago (Updated 1 day ago) • Actively hiring

Expires 7/23/2026

Job Description

AI Researcher - Reinforcement Learning
1X
San Carlos, CA

Job Details
Full-time
$200,000 - $300,000 a year
Posted 22 hours ago

Benefits
  • Commuter assistance
  • Health savings account
  • Paid holidays
  • Disability insurance
  • Health insurance
  • Dental insurance
  • Flexible spending account
  • Paid time off
  • Parental leave
  • Employee assistance program
  • Vision insurance
  • 401(k) matching
  • Life insurance

Qualifications
  • Simulated training environment
  • Reinforcement learning
  • PyTorch
  • AI platforms (beyond public GPTs)
  • Robot simulation

Full Job Description

About 1X

We're building humanoid robots that work in the home: doing the chores, handling the tasks, and giving people their time back.
Sounds simple, but it's not. To do this right, we have to solve robotics, AI, and manufacturing at the same time, at scale, in a form factor that has to be safe enough to live with your family. If you're inspired by this, you'll thrive here.

We've been at this since 2014, and we're at the point where the hard problems are behind us and the hard work is in front of us. NEO is our flagship: a home robot designed to move, learn, and operate in the real world alongside real people. We're not demoing it; we're shipping it.

If this excites you, we're excited to meet you. If you've spent your career working on problems that matter and want to see them actually reach the world, this is that moment. We're scaling, we're hiring with intention, and we need people who want to build something that will genuinely change how humans spend their time, safely creating abundance for all.

About the Team

The Reinforcement Learning team teaches NEO new capabilities, training policies for manipulation and locomotion tasks across simulation and real-world environments, then deploying them into homes. We work at the intersection of algorithm development, sim-to-real transfer, and production deployment: our research is only successful when a policy runs reliably on a physical robot in the field. If you want to directly expand what a humanoid robot can do for people, this is that team.

Your Charter

Own the full pipeline from RL algorithm development through production deployment: training NEO on manipulation and locomotion tasks in simulation, closing the sim-to-real gap, and shipping policies that work reliably in real-world home environments.

This is critical-path work: the range of tasks NEO can perform safely and reliably is a direct function of the quality of the RL policies your team ships. You will collaborate closely with hardware, controls, data collection, and QA teams, and measure your impact by what NEO can do in the field.

Key Outcomes
  • Train and deploy RL policies for manipulation and locomotion tasks that perform reliably in real-world home environments, measured by field task success rates, not just simulation benchmarks.
  • Advance sim-to-real transfer techniques that measurably narrow the gap between simulation training performance and real-world policy behavior, enabling faster iteration cycles.
  • Build training and evaluation infrastructure that lets the team iterate on policies faster, with standardized benchmarks, automated regression detection, and clear connections between training metrics and field performance.
  • Partner with hardware, controls, data, and QA teams to ship RL-trained skills to production customer sites, owning the handoff from research to deployment.

Key Competencies
  • Sim-to-real practitioner: has closed the sim-to-real gap on physical systems; understands domain randomization, reward shaping, and the engineering required to make simulated policies transfer reliably to real hardware.
  • RL algorithms depth: has a strong foundation in RL algorithms (PPO, SAC, TD-MPC, or similar); can choose the right approach for the task and modify or extend it when standard methods fall short.
  • Full-stack ownership: owns data engineering, model architecture, and deployment; treats a promising training curve as the beginning of the job, not the end.
  • Effective cross-functional partner: works closely with hardware, controls, QA, and data teams to translate RL research into deployed robot skills, and communicates technical constraints clearly across disciplines.

Minimum Requirements
  • Strong Python and/or C++, with experience in large codebases and build tools (Bazel or equivalent)
  • Proficiency with PyTorch for RL policy training and experimentation
  • Hands-on experience with simulation platforms (Isaac Sim, MuJoCo, or equivalent) for policy training at scale
  • Demonstrated experience training RL policies for manipulation or locomotion tasks, including addressing the sim-to-real gap on physical hardware

Preferred Skills
  • Experience with model-based RL or world-model-guided policy learning that leverages predictive models to improve sample efficiency
  • Familiarity with imitation learning or learning from demonstration (behavior cloning, GAIL, IQL) as a complement or bootstrap to RL
  • Experience deploying RL-trained policies to physical robots in production environments, including monitoring, failure analysis, and iterative improvement
  • Background in legged locomotion, dexterous manipulation, or contact-rich control for physical systems

Compensation Range

$200,000 - $300,000 + Equity

Benefits
  • Comprehensive medical, dental, and vision coverage
  • Generous paid time off, company holidays, and parental leave
  • 401(k) plan with company match (100% on the first 3% of contributions, 50% on the next 2%)
  • Flexible Spending Account (FSA) and Health Savings Account (HSA) options
  • Commuter benefits (transit and parking)
  • Short-term and long-term disability, and life insurance
  • Employee Assistance Program (EAP) for mental health, financial, and personal support
  • Onsite snacks and catered lunches

Equal Opportunity Employer

1X is an Equal Opportunity Employer.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, ancestry, citizenship, age, marital status, medical condition, genetic information, disability, military or veteran status, justice system impact, or any other characteristic protected under applicable federal, state, or local law.

Compensation Range: $200K - $300K