Senior Data Engineer

Mindlance

North Chicago, IL (In Person)

Full-Time

Posted 6 weeks ago (Updated 1 week ago) • Actively hiring

Expires 6/23/2026

Job Description

Senior Data Engineer #26-11294
North Chicago, IL
All On-site

We are looking for a Software Development Engineer to build and scale an AI-powered document parsing platform that extracts structured data from complex PDFs (pharmaceutical batch records, certificates, regulatory documents) using OCR, LLMs, and RAG. You will work across the full stack: backend AI pipelines, frontend chat interface, and cloud infrastructure.

Roles & Responsibilities:

- Design and develop production-grade RAG (Retrieval-Augmented Generation) pipelines for domain-specific document querying with hybrid search, reranking, and multi-agent answer synthesis
- Build and optimize document processing pipelines using AWS Textract for OCR extraction from tables, handwritten content, and structured forms
- Integrate and orchestrate multiple LLMs (Claude, Gemini) for intent classification, data extraction, validation, and conversational AI
- Develop and maintain the FastAPI backend: REST APIs, streaming endpoints (SSE), authentication, and background task processing
- Build responsive frontend features using .js, React, and TypeScript: chat interface, PDF viewer with highlights, real-time progress tracking
- Manage cloud infrastructure on AWS: EC2 deployment, S3 storage, RDS (PostgreSQL), and IAM configuration
- Work with vector databases (Weaviate) and graph databases (Neo4j) for semantic search and structural document querying
- Implement chunking strategies, embedding generation, cross-encoder reranking, and semantic caching for accurate document retrieval
- Deploy and monitor AI models and services in production: model fallback chains, retry mechanisms, error handling
- Write clean, maintainable code with proper logging, error handling, and documentation

Required Skills:

- Python (FastAPI, async programming, pandas)
- TypeScript / React (.js)
- RAG systems: vector search, embeddings, chunking, reranking (production-grade)
- LLM integration: prompt engineering, structured output, multi-model orchestration
- AWS: EC2, S3, Textract, RDS PostgreSQL
- REST API design with streaming (SSE)
- Git, basic CI/CD, Linux server management

Good to Have:

- Weaviate, Neo4j, or similar vector/graph databases
- Gemini Vision or GPT-4V for document image analysis
- LangChain / LangGraph
- Docker, nginx
- Pharmaceutical/regulated document experience

Experience:

3-6 years