Job Description
Founding Technical Staff
Tessel
Sunnyvale, CA

Job Details
Full-time
$150,000 - $220,000 a year

Qualifications
AI models, Tooling, AI tools proficiency, Math, APIs, Machine learning (ML) fundamentals, Python

About Tessel
Tessel is the evidence infrastructure for safety-critical AI: the system that proves a model works, and keeps proving it.

The hard part in safety-critical sectors isn't building models, it's proving to stakeholders that they are safe and effective. Today, regulators, buyers, and insurers increasingly demand evidence about AI behavior across approval, procurement, and reimbursement. AI vendors are chasing a moving target: there are no settled evidence requirements, and each stakeholder defines safety differently. As a result, important model failures often remain poorly understood until external review or real-world deployment forces them to the surface.

Tessel surfaces AI failures in safety-critical sectors before they cost companies regulatory approval and customer trust. We align AI development with the outcomes that matter for the business, providing the infrastructure and methodology to continuously investigate how a model behaves and to generate the evidence needed to confidently navigate approval, procurement, and reimbursement.

Role Overview
You'll work directly with diagnostic imaging AI vendors preparing 510(k) or De Novo submissions, and with academic medical centers building and deploying their own models under LDT pathways, running evidence investigations into model behavior using an AI-native workflow. This involves rigorously analyzing data, probing the models' internal representations, and producing claims about model behavior backed by structured evidence. You'll also iterate on the platform itself, rethinking the evidence standards and methodology we impose, such as the safety case abstractions we instantiate.
You'll critique the current platform, write methodology proposals, and write Python platform code yourself. When frontend features need user testing before we commit to building them in production, you'll prototype them with AI tooling.

The title is Technical Staff (the in-fashion term for an ML engineer), but the role is really that of a founding member of the team. We're looking for someone who can challenge our assumptions and work with us beyond the purely technical to build what we think is foundational infrastructure for the future of ML, starting with delivering real value to our customers.

Key Responsibilities
Own engagements end-to-end: run the customer meetings, scope the real question, investigate model behavior (data analysis, probing internal representations, building safety cases), and deliver findings you stand behind, including telling customers when a model isn't ready.
Improve the platform: write methodology proposals on evidence standards and safety case abstractions, ship Python platform code, and prototype frontend features with AI tooling for user testing.
Shape how engagements run as the team grows: cadences, standards, onboarding.

Skills & Experience
Required:
Strong data science and ML fundamentals. You understand the math behind the methods you use, not just the APIs, and you've applied them in real systems with messy data and unclear questions.
Python fluency end-to-end, from raw data through analysis to platform code others will use.
Informed engineering decisions based on trade-offs: how to structure code, where to draw interface boundaries, when reusability is worth the cost. If you're not fully there yet, a strong willingness to get there.
Technical communication that adapts to different audiences without losing precision, and rigorous positions that hold under commercial pressure. You're comfortable telling customers their model isn't ready when it isn't.
Fluency with AI-assisted workflows. You're comfortable using AI tooling as a primary mode of working, not just an occasional helper.

Nice-to-Have:
Top-tier degree in CS with an AI track, ML, math, physics, engineering, or a related field, plus 3-5 years of professional ML experience, or a PhD in a relevant area such as model evaluation, robustness, OOD detection, or interpretability.
Experience reviewing AI/ML medical devices at the FDA, or equivalent regulatory experience.
Strong background in safety case methodology from aviation, automotive, or another safety-critical field.

Performance Expectations
By month 3: leading 2 engagements independently, with findings the team would defend externally.
By month 6: running 4 concurrent engagements, and having contributed to and pushed back on platform features and evidence methodology in meaningful ways.
By month 12: setting internal standards across engagement methodology, evidence quality, and onboarding and coaching for new hires.
We'll refine these milestones together in your first weeks, with 1:1s and quarterly formal reviews.

Compensation Range: $150K - $220K