Staff ML Ops Engineer Position Available In Hillsborough, Florida
Tallo's Job Summary: Ybor.ai is hiring a Staff ML Ops Engineer to lead the MLOps platform, design scalable ML pipelines, optimize ML inference systems, collaborate with cross-functional teams, automate deployment of ML models, and ensure compliance and security in ML workflows. The role requires expertise in ML engineering, cloud platforms, CI/CD, and Python programming. A degree in Computer Science and 5+ years of relevant industry experience are required. Join Ybor.ai to shape the future of ML inference at scale!
Job Description
We are seeking a seasoned Staff ML Ops Engineer to architect, build, and optimize an ML inference platform, including agentic AI workflows. This role requires deep expertise in machine learning engineering and infrastructure, with a primary focus on building and scaling ML inference systems in production. The ideal candidate will have proven experience designing scalable, reliable ML pipelines, an automation mindset, and the ability to work independently on complex challenges and deliver innovative solutions.
Responsibilities
Lead the MLOps platform, ensuring scalable, reliable, and efficient model workflows in production.
Develop and optimize ML pipelines to support model performance and scalability across the organization.
Design and implement high-performance inference systems for a variety of models, including agentic AI workflows with purpose-driven LLM integration.
Collaborate with cross-functional teams (Data Science, Software Engineering, and Product) to align MLOps strategies with business goals.
Provide technical leadership and mentorship, guiding best practices for MLOps and DevOps.
Automate and streamline deployment of ML models to production, ensuring minimal downtime and robust versioning.
Develop and integrate CI/CD workflows for ML systems.
Implement monitoring tools to track model and system performance.
Ensure compliance, security, and governance in ML workflows and deployments.
Analyze cost-performance trade-offs to optimize resource allocation and system efficiency.
Communicate machine learning engineering strategies effectively across different management levels and stakeholders.
Requirements
Proven experience in designing and implementing scalable ML inference systems.
Strong background in ML frameworks (TensorFlow, PyTorch, Scikit-learn) and distributed computing.
Expertise in cloud platforms (Azure, GCP, or AWS) and containerization (Docker, Kubernetes).
Strong CI/CD experience (GitHub Actions, ArgoCD).
Hands-on experience with one or more model deployment technologies (MLflow, Kubeflow, Seldon, SageMaker).
Advanced programming skills in Python; experience in Java, Scala, or Go is a plus.
Experience with database systems (SQL, NoSQL) and big data frameworks (Spark, Hadoop) is a plus.
Strong grasp of MLOps capabilities, including data versioning, feature stores, model monitoring, and experiment tracking.
Experience with model optimization techniques such as distillation, quantization, and hardware acceleration.
Familiarity with governance frameworks for responsible AI is preferred.
Strong problem-solving skills and the ability to work independently in a remote setting.
Qualifications
Degree in Computer Science with 5+ years of relevant industry experience, specialized in Machine Learning.
Advanced certifications in cloud computing, machine learning, or DevOps are a big plus.
About Ybor.ai
At Ybor.ai, we are at the forefront of building AI-infused enterprise solutions that drive real-world impact. Our multi-cloud platform serves to provision, compute, connect data, infuse AI, and rapidly deploy enterprise workloads to any cloud. We foster a culture of innovation, collaboration, and technical excellence, empowering our engineers to push the boundaries of MLOps and AI infrastructure. Join us to lead and shape the future of ML inference at scale!
#MLOps #MachineLearning #AI #DeepLearning #MLInfrastructure #Hiring #TechJobs #RemoteJobs