Job Description
MLOps Engineer
Strategic Healthcare Programs
Goleta, CA

Strategic Healthcare Programs (SHP) is a leading provider of analytics and performance management solutions for the post-acute healthcare market. We are an industry leader in helping Home Health, Hospice, and Skilled Nursing providers improve their financial and quality performance while complying with many regulatory requirements. Additionally, we connect the post-acute world to the broader provider markets to allow for optimal management across the continuum of care.

Role Overview
We're hiring a strong Python engineer to build and operate our production ML platform end-to-end. You'll productionize data science work by building robust on-premises infrastructure, establishing software engineering best practices, and creating the tooling that enables our data scientists to ship faster. All infrastructure is self-hosted.

This is a remote or hybrid position within the United States. Employees living within 75 miles of the Santa Barbara office are required to work in person at the office every Wednesday.

ML experience is welcome but not required. We care most about your software engineering foundation: production Python, OOP, testing, and async/parallel performance. Our existing ML engineers will get you up to speed on the ML side: frameworks, LLMs, vector stores, vLLM, and the rest.

Team:
You'll join a small, tight-knit ML team where every engineer owns meaningful surface area and their code end-to-end. We value people who deeply understand the systems they build, not just that they run.

What You'll Do Day-to-Day

Production ML Systems (40%)
- Build automated ML pipelines: data ingestion, training, evaluation, deployment, retraining
- Deploy and serve models (batch + real-time) via FastAPI/Flask APIs with auto-scaling and rollback
- Implement CI/CD for ML: model packaging, versioning, automated deployments
- Optimize workflows using async, parallelism, Ray, and Dask

ML Platform & Tooling (35%)
- Design reusable internal Python packages for preprocessing, training, inference, and evaluation
- Refactor data science notebooks into maintainable OOP modules
- Build workflow orchestration for training and inference pipelines
- Create standardized templates for model development

Observability & Reliability (15%)
- Monitor latency, drift, data quality, and model performance
- Build alerting for degradation and anomalies (Prometheus, Grafana)
- Create dashboards for production model health
- Set up automated retraining triggers

Code Quality & Collaboration (10%)
- Coach data scientists on production-grade Python: testing, OOP, async/parallel patterns
- Establish and enforce software best practices across the ML codebase
- Partner with data scientists to translate pain points into engineering solutions

Required Skills
Must Have:
- 5+ years of production Python engineering
- Strong OOP fundamentals: classes, inheritance, composition, design patterns
- Testing discipline: unit, integration, fixtures, mocking
- Demonstrated async and parallel optimization (asyncio, multiprocessing, threading)
- Building and operating production Python services (APIs, workers, background jobs)
- Familiarity with FastAPI or Flask
- Experience deploying to self-hosted/on-prem environments

Soft Skills:
- Translate engineering needs into clean, maintainable code
- Comfortable coaching peers on production engineering practices
- Curious about ML and motivated to ramp into it

Nice-to-Have
- Prior MLOps or ML platform experience
- ML frameworks: scikit-learn, XGBoost, PyTorch
- Observability stack: Prometheus, Grafana, structured logging/tracing
- RAG pipelines: vector stores, semantic search
- LLM serving: vLLM, Text Generation Inference
- GenAI/agentic frameworks: LangChain, LlamaIndex, DSPy
- Orchestration: Prefect, Kubeflow, Airflow, or similar
- Kubernetes and containerization in on-prem environments
- Experiment tracking: MLflow
- LLM observability: Phoenix, Langfuse, OpenLIT
- On-prem GPU infrastructure management

Pay
$140,000 - $175,000 annually, depending on experience.

Benefits
We value work/life balance. We offer comprehensive health benefits, a 401(k) plan with a company match, an employee stock purchase plan, vacation time, sick time, and paid holidays.

This position is not eligible for immigration sponsorship.

Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities
This employer is required to notify all applicants of their rights pursuant to federal employment laws. For further information, please review the Know Your Rights (https://www.eeoc.gov/poster) notice from the Department of Labor.