Position:
Generative AI Engineer
Overall Experience:
8+ Years (with strong Python expertise)
GenAI Experience:
2+ Years (in Production Environments, not just POCs)
Locations:
Charlotte, NC | New Jersey, NJ
Work Mode:
Hybrid (3 days onsite per week)
Employment Type:
Full-Time / Contract-to-Hire
Interview Process:
Includes a mandatory, live hands-on coding round
Key Responsibilities
- GenAI Solution Engineering & Advanced RAG
Orchestration:
Design and build production GenAI applications using LangChain and LangGraph for multi-agent, stateful, and graph-based workflows.
RAG Optimization:
Develop and optimize RAG pipelines including advanced patterns like HyDE, re-ranking, hybrid search, multi-hop retrieval, and RAPTOR hierarchical summarization.
API Development:
Build and expose GenAI capabilities as RESTful and streaming APIs using FastAPI (with async support, dependency injection, and OpenAPI documentation).
- MCP Server Development & LLMOps
Context Architecture:
Architect and maintain Model Context Protocol (MCP) servers to securely connect LLMs to heterogeneous enterprise data sources (SQL, NoSQL, APIs).
Observability:
Integrate systems with frameworks like LangSmith, Helicone, Arize, or OpenTelemetry for tracing, latency profiling, and prompt lineage.
Guardrails & Monitoring:
Own prompt versioning and model evaluation (RAGAS, ROUGE, BERTScore), and implement guardrails (Guardrails AI, NeMo Guardrails) for PII redaction and toxicity filtering.
- Full-Stack Integration & Governance
Angular Frontend:
Develop Angular-based user interfaces (chat UIs, agent monitors, dashboards) and consume FastAPI streaming endpoints (SSE / WebSockets) for real-time token streaming.
Platform Governance:
Contribute to architectural decisions around model routing, semantic caching (Redis), and multi-tenant isolation while ensuring compliance with enterprise data governance.
Required Qualifications
Experience:
8+ years of total software engineering experience; 2+ years of hands-on, production-level Generative AI experience.
Core GenAI Stack:
LangChain, LangGraph, LLM APIs (OpenAI, Anthropic, Azure, Bedrock), and Vector Stores (Chroma, Pinecone, Weaviate, pgvector).
Backend & Protocol:
Python 3.10+ (async/await, Pydantic v2), FastAPI, and hands-on experience building/deploying MCP servers or equivalent context-injection frameworks.
Frontend:
Angular 15+ (components, services, RxJS, and signals) to connect backend AI services to the user interface.
Data & Infrastructure:
SQL + NoSQL (PostgreSQL, MongoDB, Redis), Docker, Kubernetes, and GitHub Actions CI/CD.
Preferred (Plus) Skills
- Experience with graph-based architectures for complex, cyclic reasoning tasks.
- Familiarity with fine-tuning workflows (LoRA/QLoRA, PEFT, DPO/RLHF) and distributed inference (vLLM, TGI, Triton).
- Prior experience in regulated industries (Banking, FinTech, Healthcare) with awareness of model risk management frameworks (e.g., SR 11-7).