AI Infrastructure Data Engineer || Remote role

Job

Verito Solutions

Remote

Full-Time

Posted 3 days ago (Updated 10 hours ago) • Actively hiring

Expires 7/4/2026

Job Description

AI Infrastructure Data Engineer. Remote role. Long-term contract. References required.

Build the data backbone that powers AI: pipelines, knowledge bases, ingestion, and retrieval infrastructure.

Minneapolis (Hybrid)
Intermediate / Senior
4-8 YOE
Data pipelines required

AI systems are only as good as the data feeding them.

This role owns the infrastructure that gets data from internal systems, document stores, APIs, and enterprise databases into vector indexes, knowledge bases, and structured stores that AI agents can reliably query. You'll build ingestion pipelines with freshness management, design chunking and embedding strategies, and ensure retrieval quality: the hidden layer that determines whether agents give accurate answers or hallucinate.

This is not a traditional data warehousing role; it is data engineering specifically in service of AI systems.

WHAT YOU'LL BUILD

▸ Ingestion pipelines pulling from internal systems, APIs, document repositories, and enterprise databases into AI knowledge stores
▸ Vector indexing infrastructure: embedding model selection, chunking strategies, metadata enrichment, hybrid index design
▸ Freshness and change detection: incremental re-indexing, stale data detection, TTL management
▸ ETL / ELT pipelines for structured data feeding AI decision and retrieval layers
▸ High-throughput event-driven ingestion for real-time and batch processing at enterprise scale
▸ Data quality validation: schema checks, completeness scoring, anomaly detection before indexing

REQUIRED EXPERIENCE

▸ 4+ years building production data pipelines: orchestrated workflows, not one-off scripts
▸ Strong SQL: query optimization, indexing, execution plans, large result sets
▸ Experience with vector databases or search infrastructure (OpenSearch, Pinecone, pgvector, Azure AI Search)
▸ Python data processing at scale: Pandas, Polars, or equivalent
▸ Understands embedding models: how to evaluate retrieval quality, why chunking strategy matters
▸ Cloud data stack: AWS (Glue, S3, RDS) or Azure equivalent
▸ Can diagnose why a RAG system's retrieval is failing at the data layer

NICE TO HAVE

▸ Event streaming platforms: event-driven pipeline design, high-throughput ingestion patterns
▸ Legacy enterprise RDBMS experience (DB2, Oracle, or equivalent)
▸ Document intelligence: OCR pipelines, PDF/scanned document ingestion
▸ dbt, Airflow, or similar pipeline orchestration tooling
▸ Knowledge graph experience: Neo4j, Amazon Neptune, RDF/SPARQL, ontology design
▸ Experience building knowledge bases specifically for LLM consumption, not just generic warehousing
▸ Financial services data: understanding of regulated data handling, PII, audit trails

TECH STACK

Python
Pandas / Polars / PySpark
ETL / ELT Pipelines
Event Streaming Pipelines
Vector Databases (pgvector, Pinecone, Weaviate)
OpenSearch
Hybrid Search
Knowledge Graphs
Graph Databases
Neo4j
Amazon Neptune
RDF
SPARQL
Ontology Design
Embedding Models
Chunking Strategies
Document Intelligence
OCR Pipelines
dbt
Airflow
Pipeline Orchestration
Cloud Data Services (AWS / Azure)
Relational Databases
SQL Optimization
Data Quality
Schema Validation
Docker
Container Orchestration
Enterprise API Integration

"Believe you can and you're halfway there." Theodore Roosevelt

Sayantan Das | Senior Tech Recruiter
E:
P: +1