Major Incident Manager
TEKsystems
Remote
$182,000 Salary, Full-Time
Job Description
Incident Managers:
Hours:
Most hires are for 2:00pm-10:00pm, other hires for 8:00am-5:00pm Central. Must have flexibility to work extended or off-hours during incidents as required. In the event of an extended shift, the intention is to adjust hours to keep the week as close to 40 hours as possible, but depending on the situation overtime may be required.
Top 3 Priorities for this role:
1. Stakeholder Communication
2. Technical Depth (infrastructure centric)
3. Leadership Experience - not just years of experience but depth of what the candidate has actually done in their career
Position Summary
The Major Incident Manager is a high-impact leadership role responsible for the end-to-end management of critical IT service disruptions that affect end users and operations. This role goes beyond traditional incident coordination and requires command authority, strong executive communication, and a deep understanding of IT operations. This role exists because the current incident model is not meeting enterprise needs and requires experienced leadership to reset and stabilize Tier 1 incident response across multiple facilities. This role is accountable for incident outcomes, not coordination alone.
Role Summary
The Major Incident Manager acts as the "Incident Commander" (single point of command), driving the swift restoration of critical services while maintaining transparent communication with executive leadership and clinical stakeholders. This role is responsible for incident outcomes, not just processes.
Key Responsibilities
Incident Command & Coordination:
- Lead the Major Incident Bridge by facilitating 24/7 technical bridge calls and "war rooms" to triage and resolve Priority 1 (P1) and Priority 2 (P2) incidents impacting multiple systems.
- Major incident management (MIM) applies when an incident impacts multiple facilities, has wider user impact, or involves multiple technology failures.
- Lead real-time decision-making under pressure, balancing technical recovery with safety and clinical impact.
- Resources will not typically be responsible for owning concurrent major incidents.
Clinical Impact Assessment:
- Quickly evaluate the scope of technical outages to determine their impact on patient safety and critical business applications.
- Outages may impact, for example, EHR systems, clinical imaging, and network dependencies.
- Identify trends, anomalies, and recurring failure patterns.
Stakeholder Communication:
- Issue regular, clear updates to leadership (CIO/CTO) and Service Portfolio leads using pre-defined communication templates and protocols.
- Maintain confidence and calm in high-pressure situations with leaders and clinical partners.
Post-Incident Management:
- Own and facilitate after-action reports and Post-Incident Reviews (PIR) within 48 hours to identify root causes and drive preventive actions.
- Conduct root cause discussions and ensure corrective actions are identified and tracked.
- Communication and stakeholder management are the number-one success factor for this role and are treated as the top priority in candidate review.
Process Improvement:
- Own and continuously optimize the enterprise incident management process in alignment with ITIL best practices.
Vendor & Matrix Management:
- Coordinate with third-party vendors and internal cross-functional teams (Network, Security, Clinical Apps) to ensure rapid service recovery.
Monitoring and Alerting:
- Utilize AIOps, triaging and monitoring tools, dashboards, and alerting systems across on-premise and cloud environments (e.
Incident Response:
- Serve as an escalation point for complex operational incidents, guiding technical teams to swift and effective resolution for critical monitoring issues.
Performance Optimization:
- Analyze monitoring data and performance metrics (MTTD, MTTA, MTTR, Incident Recurrence Rate, SLA Compliance) to identify trends, anomalies, and potential issues, providing recommendations for improvement and capacity planning.
Automation:
- Identify and implement automation opportunities for major incident management and routine tasks to reduce manual workload and improve efficiency.
Collaboration and Documentation:
- Collaborate with cross-functional teams (Application, Network, Security, Cloud Enablement, Managed Service Provider, etc.
Problem Management:
- Participate in root cause analysis (RCA) and post-incident reviews to prevent recurring issues and drive long-term solutions.
Qualifications and Skills:
- 8+ years of progressive IT operations or enterprise support experience.
- 5-7+ years in IT major incident management or IT service management.
Soft skills:
- Excellent crisis leadership and decision-making abilities under pressure.
- Strong analytical mindset with the ability to interpret complex data and performance metrics to drive strategic decisions.
- Exceptional communication and interpersonal skills, capable of effective collaboration with stakeholders at all organizational levels.
- Strong problem-solving and organizational skills, with a focus on details and time management.
Preferred Qualifications and Certifications:
- ITIL 4 Foundation or higher.
- Experience with industry-standard monitoring and observability tools (e.g., ScienceLogic, SolarWinds).
Work Environment:
- Dynamic and collaborative work environment.
- May require off-hours work or participation in an on-call rotation to support operations.
- Travel requirements may include quarterly onsite business meetings and occasional travel abroad for vendor relationship management.
- This role often operates in a dynamic, fast-paced, remote operations center and may require working irregular hours, including nights, weekends, and holidays, to help manage critical incidents and service delivery escalations.
- Medical, dental & vision
- Critical Illness, Accident, and Hospital
- 401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
- Life Insurance (Voluntary Life & AD&D for the employee and dependents)
- Short and long-term disability
- Health Spending Account (HSA)
- Transportation benefits
- Employee Assistance Program
- Time Off/Leave (PTO, Vacation or Sick Leave)
Workplace Type:
This is a fully remote position.
San Francisco Fair Chance Ordinance:
Pursuant to the San Francisco Fair Chance Ordinance, for all positions located in the city and county of San Francisco, we will consider for employment qualified applicants with arrest and conviction records.
Massachusetts Lie Detector:
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
Use of Artificial Intelligence (AI):
We may use Artificial Intelligence (AI) to support parts of our hiring process, including sourcing, screening, and evaluating candidates. AI helps assess applications and qualifications, but final decisions are made by our hiring team. By applying, you acknowledge and agree that your application may be reviewed using AI tools.