DevOps/SRE Engineer
Robert Half
Minneapolis, MN (In Person)
Full-Time
Job Description
We are looking for a DevOps/SRE Engineer in Minneapolis, Minnesota, to support the reliable delivery and day-to-day operation of AI-driven capabilities within customer-facing products. This position focuses on production readiness, service stability, and smooth integration across platforms rather than on building the underlying AI features. The ideal candidate will help ensure that new functionality is introduced safely, monitored effectively, and maintained with a strong emphasis on performance, cost awareness, and customer impact.
Responsibilities:
- Lead the release of AI-enabled product capabilities from pre-production validation through live deployment, ensuring launches are controlled and dependable.
- Oversee production health by tracking availability, response times, failures, and service quality, and take prompt action when issues affect performance.
- Maintain and improve connections between external AI providers, internal model services, and customer-facing applications to support reliable functionality.
- Administer API credentials, usage thresholds, vendor quotas, and spend controls, proactively identifying risks related to capacity or budget.
- Create and refine operational dashboards, alerting rules, and response documentation to strengthen support for AI-related incidents.
- Work closely with product and engineering partners to plan staged rollouts, feature gating, rollback paths, and low-risk release strategies.
- Support customer-facing teams by explaining AI feature readiness, expected delivery timelines, and practical capabilities in clear business terms.
- Participate in customer or sales discussions when technical expertise is needed, helping address questions about solution behavior, roadmap direction, and use case alignment.
- Manage investigation and resolution of customer-impacting incidents by coordinating with internal stakeholders and external vendors while providing timely updates.
- Monitor usage patterns, operating costs, vendor changes, model retirements, and security notices, and prepare tested mitigation or migration plans before service is affected.
Requirements:
- At least 3 years of experience in DevOps, site reliability, software engineering, or production operations supporting live customer environments.
- Strong programming ability in Python with practical experience working with APIs, webhooks, and asynchronous service interactions.
- Proven background running production systems, with an understanding of reliability, scalability, and incident handling under real-world load.
- Hands-on experience with monitoring and observability platforms such as Datadog, Grafana, New Relic, Amazon CloudWatch, or comparable tools.
- Familiarity with at least one major AI platform, including OpenAI, Claude, Azure OpenAI, Amazon Bedrock, or Google Vertex AI, along with production concerns such as latency, fallback design, rate limits, and cost control.
- Working knowledge of cloud infrastructure and CI/CD practices used to deploy, update, and maintain services consistently.
- Ability to write clear operational documentation, including runbooks and post-incident summaries, and to lead communication during service disruptions.
- Strong communication skills with the confidence to explain technical topics to non-technical stakeholders and customers while maintaining sound security and data-handling practices.