Job Description
Software Developer/Engineer

Work Locations: Philadelphia
Work Mode: Hybrid, minimum 3 days in the office
Duration: 12 Months

About the Role

Consultant Requirements - On-Prem LLM & Vector DB Implementation

Core Experience
Hands-on experience deploying open-source LLMs such as Meta Llama 3 and Mistral / Mixtral in on-prem or private environments
Strong proficiency in Python for LLM inference, prompt engineering, and integration
Experience with CPU-based inference, model quantization, and performance tuning

Vector Databases & RAG
Practical experience with open-source vector databases such as Qdrant, Chroma, Milvus, or pgvector
Proven implementation of Retrieval-Augmented Generation (RAG) pipelines
Experience generating and managing embeddings and metadata filtering

Security & Governance
Understanding of data privacy, air-gapped deployments, and enterprise security requirements
Experience implementing access controls and audit logging

Nice to Have
Experience with LangChain or LlamaIndex
Exposure to Rust, Go, or C++ for high-performance services
Familiarity with Docker and Kubernetes for on-prem deployments
Knowledge of inference frameworks (e.g., vLLM, llama.cpp, Hugging Face Transformers)
Prior work in regulated or enterprise environments

Deliverables
Reference architecture and deployment guidance
Working prototype (LLM + vector DB + RAG)
Documentation and knowledge transfer to internal teams