Incident Management Lead

Job

Tata Consultancy Services Limited

Deerfield, IL (In Person)

Full-Time

Posted 3 weeks ago (Updated 2 hours ago) • Actively hiring

Expires 6/19/2026

Job Description

Must Have Technical/Functional Skills

6+ years of IT Service Management experience, with a minimum of 3 years in a dedicated Major Incident Management or Incident Commander role in a large enterprise (Fortune 500 / FTSE 100 equivalent complexity).
ITIL 4 Managing Professional or ITIL 4 Specialist: High Velocity IT certification (ITIL 4 Foundation minimum required).
Demonstrable experience managing Azure platform incidents: working knowledge of Azure Monitor, Azure Service Health, Log Analytics, Application Insights, and Microsoft support escalation paths.
Proven ability to command high-pressure P1 incidents involving 20+ stakeholders across technical and executive levels simultaneously.
Expert-level proficiency in ServiceNow ITSM, including the Incident, Problem, and Change modules and dashboard/report building.
Strong data analysis skills: ability to analyze incident trends, build KPI dashboards, and present actionable insights to senior leadership.

Roles & Responsibilities

Major Incident Command & Coordination

Serve as the single accountable owner for all P1 and P2 major incidents across on-premises and Azure-hosted services, from initial declaration through resolution and post-incident closure.
Convene and chair live incident bridge calls and virtual war rooms using Microsoft Teams, coordinating across 10+ internal technical resolver groups, managed service partners, and Microsoft Azure Support (Unified Support escalations).
Drive swift triage by leveraging Azure Service Health, Resource Health, and Azure Monitor dashboards to rapidly establish scope, affected services, and blast radius within the first 15 minutes of an incident.
Make and enforce escalation decisions, including engaging Microsoft CSS P1 Severity A support cases and activating DR runbooks where service restoration via normal means is not achievable within RTO.
Maintain clear, timely, and audience-appropriate stakeholder communications throughout the incident lifecycle, including CEO/CISO executive briefings for business-critical outages.

Post-Incident Review & Continual Improvement

Facilitate structured, blameless Post-Incident Reviews (PIRs) within agreed SLAs (P1: 48 hours; P2: 5 business days); produce high-quality PIR reports consumed by the CTO and the Board Technology Committee.
Own the incident action item registry; chair weekly SIP (Service Improvement Plan) reviews to ensure commitments are delivered on time and to quality.
Identify systemic incident patterns through trend analysis using ServiceNow and Log Analytics; collaborate with Problem Management to drive root cause elimination for repeat incidents.
Define, track, and report on enterprise incident management KPIs: MTTD, MTTR, incident recurrence rate, SLA compliance, and customer impact hours, presented to IT leadership in monthly operational reviews.

Process Ownership & ITSM Governance

Own, maintain, and continuously improve the enterprise Major Incident Management process, policy, playbooks, and runbooks aligned to ITIL 4 and the organization's IT Risk and Control Framework.
Define and govern the incident severity classification matrix and escalation decision tree; ensure consistent adoption across all IT towers and managed service partners.
Maintain and test the enterprise crisis communication framework, including stakeholder notification trees, bridge protocols, and executive communication templates.
Collaborate with Change Management to ensure CAB processes adequately assess change-induced incident risk; maintain correlation tracking between changes and incidents.

Azure Operations & Cloud Incident Specifics

Develop and maintain Azure-specific incident playbooks covering platform scenarios: AKS node/pod failures, Azure SQL failover events, ExpressRoute circuit drops, Azure Active Directory (Entra ID) authentication outages, and Azure region-wide service incidents.
Maintain working relationships with the Microsoft TAM (Technical Account Manager) and the Azure Rapid Response team; ensure escalation paths to Microsoft CSS are exercised and SLAs understood.
Monitor Azure Service Health and the Microsoft 365 Service Health Dashboard proactively; initiate pre-emptive incident declarations for advisory/degraded-service notifications affecting business-critical services.
Participate in Azure Operational Reviews with Cloud Platform and SRE teams to identify observability gaps, alerting blind spots, and runbook deficiencies before they manifest as major incidents.

Capability Building & Stakeholder Engagement

Design and deliver MIM process training programmes for Level 1/2 Service Desk, resolver groups, and technology leadership; conduct quarterly simulation exercises (GameDay / IncidentEx).
