Site Reliability Engineer (SRE) - Azure

Job

Matlen Silver

Chandler, AZ (In Person)

Full-Time

Posted 6 days ago (Updated 4 days ago) • Actively hiring

Expires 7/21/2026

Job Description

SO5 - Public Cloud - Azure Sub-effort 68: The requested Azure SRE role will ensure reliability and operational excellence for Azure platform services delivered through the product operating model. The role partners closely with horizontal teams across platform engineering, networking, security, and AI infrastructure to embed reliability into Azure-native solutions. Responsibilities include supporting observability, incident response, automation, resilience testing, and operational runbook development for Azure landing zones, ingress and egress DMZs, service integration layers, and AI infrastructure supporting Microsoft Foundry and private OpenAI access. The role ensures platform components meet defined reliability, availability, and scalability objectives while enabling consistent onboarding and sustained operation of enterprise workloads in Azure.
Primary Skill: Microsoft Azure
Desired Skills Experience:
Direct experience supporting AI infrastructure in Azure, including platforms enabling Microsoft Foundry and private OpenAI access. Strong familiarity with designing and operating highly available service integration layers and network perimeters in complex enterprise environments. Advanced experience integrating reliability objectives into platform product operating models and influencing reliability standards across horizontal teams. Demonstrated skills in automation, platform observability design, and driving operational excellence across shared Azure services.
Required Skills Experience:
Strong experience operating and supporting Azure platform services with a clear focus on reliability, availability, and scalability. Hands-on experience with Site Reliability Engineering practices, including observability, incident response, automation, resilience testing, and operational readiness. Effective collaboration with platform engineering, networking, security, and infrastructure teams to embed reliability into Azure-native solutions. Experience supporting enterprise Azure landing zones, ingress and egress DMZs, service integration layers, and shared platform components. Experience developing and maintaining operational runbooks and ensuring platforms support consistent onboarding and sustained operation of enterprise workloads. Solid experience operating secure Azure environments and supporting AI infrastructure workloads at scale.
Secondary Skill: Cloud Implementation (Data on Cloud, Integration with Cloud)
Tertiary Skill: Cloud Architect