Senior Engineer - Data, Schema & Knowledge Systems
IBM
Austin, TX (In Person)
Full-Time
Job Description
Introduction
At IBM Software, we transform client challenges into solutions.
Your role and responsibilities
We are seeking a Senior Software Engineer to own and evolve core platform systems spanning knowledge ingestion, memory architecture, evaluation infrastructure, and gateway data management.
What You'll Own
Knowledge Base & Memory (Rust)
- Design and evolve the data schema and ingestion pipeline supporting large-scale documentation corpora, including document extraction, segmentation, and hybrid search (BM25 + vector).
- Improve corpus quality through deduplication, relevance tuning, quality scoring, source-of-truth tracking, and versioned corpus management.
- Own the memory architecture across working, semantic, and observational memory tiers, designing retrieval that is context-aware and budget-conscious.
- Evolve federated search capabilities, including multi-KB querying, relevance tuning, embedding model selection, and quality metrics.
- Build and scale an evaluation curation system for an LLM-as-judge framework, including versioned eval datasets, regression baselines, and authoring tooling.
- Design and implement a schema-driven entity registry with YAML-defined schemas, enabling new infrastructure connectors without code changes.
- Own declarative state machine configuration decoupled from hardcoded logic.
- Design a domain-agnostic evidence model to support audit and compliance requirements (e.g., PCI-DSS, SOX).
- Formalize metadata and provenance tracking across entities, including import/export and multi-connector support.
- Extend evaluation frameworks for end-to-end coverage across composable pipelines.
- Design eval schemas, dataset management tooling, and regression thresholds.
- Partner with other teams on shared benchmarks, test corpora, and multi-model evaluation strategy.
- Track and report model quality metrics to support production deployment decisions.
Month 2: Ship the entity registry refactor — dynamic entity registration with YAML schema definitions. Design the eval curation system — dataset versioning, case authoring tooling, regression baseline management. Begin expanding eval corpus coverage.
Month 3: Ship the evidence model schema. Implement eval curation tooling. Begin knowledge base quality improvements — deduplication, source-of-truth tracking, relevance tuning. Establish eval quality dashboard with cross-model comparison.
Required technical and professional expertise
- Data modeling instincts. You think naturally about schemas, entity relationships, state machines, and how data evolves over time. You've designed data models that other engineers build against.
- Information retrieval or search experience. You've worked with search indexing, document processing, corpus management, or similar — you understand how to make unstructured data findable and useful.
- You can ship across languages. This role works in both Rust and Go. You don't need to be an expert in both, but you need to be productive in at least one and willing to learn the other.
- Quality measurement mindset. You've built or worked with evaluation systems, quality metrics, regression detection, or A/B testing infrastructure. You understand how to measure whether something is getting better or worse.
Preferred technical and professional experience
You don't need all of these coming in. The team will bring you up to speed:
- IBM Z domain knowledge — the documentation sets, infrastructure concepts, and operational patterns that the knowledge base serves
- LLM evaluation methodology — rubric-based scoring, LLM-as-judge patterns, baseline regression, multi-model comparison
- Our knowledge base ingestion pipeline (document extraction, chunking, vector + full-text indexing)
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer.