Head of High Availability Systems Engineering

Job

Executive

Rutherford, NJ (In Person)

$235,000 Salary, Full-Time

Posted 1 week ago (Updated 1 week ago) • Actively hiring

Expires 6/23/2026

Job Description

Citi, a leading global bank with approximately 200 million customer accounts in over 160 countries, provides a broad range of financial products and services to consumers, corporations, governments, and institutions. The bank's Enterprise Operations & Technology division underpins these offerings, delivering secure, reliable, and efficient technology solutions that are foundational to managing global resources, ensuring safety, and providing a first-class customer experience. This reflects Citi's mission to create economic value that is systemically responsible and in its clients' best interests.
Fostering a culture of diversity and inclusion, Citi is committed to a workforce that represents the clients it serves. The company values respect, promotes individuals based on merit, and ensures opportunities for personal development are widely available. Ideal candidates are passionate, innovative problem-solvers who contribute to a culture of delivering results with pride and are empowered to enable growth and progress together with the firm.
This is a rare opportunity to lead a globally distributed engineering team at the forefront of mission-critical, High Availability infrastructure. As a Head of High Availability Systems Engineering, you will own the strategy, stability, and evolution of a large-scale z/TPF, Stratus, and I-Series estate that underpins enterprise-grade operations around the clock. If you thrive at the intersection of deep technical mastery and senior leadership, this role offers the scope, complexity, and impact to match your ambition.
Key Responsibilities:
Team Leadership & Talent Development
  • Lead, mentor, and grow a high-performing global team of z/TPF, Stratus, and I-Series Systems Programmers
  • Foster a culture of engineering excellence, continuous learning, and accountability across geographically distributed teams
  • Set clear performance expectations and develop career pathways for senior technical contributors
Platform Engineering & System Management
  • Oversee the installation, configuration, maintenance, and upgrades of z/TPF, Stratus, and I-Series software environments
  • Drive end-to-end lifecycle management of complex, large-scale HA infrastructure, from hardware components to operating systems and related software
  • Administer and optimize the IBM Security Portal, ensuring robust vulnerability management practices are embedded across the estate
Performance Optimization & Capacity Planning
  • Proactively monitor system performance, identify bottlenecks, and architect solutions that maximize throughput and reliability
  • Lead capacity planning initiatives, translating business growth projections into actionable infrastructure roadmaps
  • Implement performance tuning strategies across z/TPF, VOS, and IOS environments
Security & Compliance
  • Champion comprehensive security policies and procedures that protect sensitive data and preserve system integrity
  • Maintain deep familiarity with mainframe security tooling, including Crypto, FICON, and platform-specific security utilities
  • Ensure compliance with enterprise security standards and regulatory requirements
Business Continuity & Disaster Recovery
  • Architect and maintain enterprise-grade disaster recovery solutions, with expert-level command of storage replication technologies
  • Lead continuity-of-business planning, testing, and execution to ensure zero-compromise availability for critical systems
Cross-Functional Collaboration & Process Improvement
  • Partner closely with application support, database administration, and network engineering teams to deliver integrated, resilient solutions
  • Identify and implement process improvements that drive operational efficiency, reduce toil, and elevate engineering standards
  • Maintain rigorous, up-to-date documentation of system configurations, runbooks, and troubleshooting procedures
Skills Required:
Technical Expertise
  • Deep, hands-on expertise in z/TPF Administration, including system programming, tuning, and troubleshooting at enterprise scale
  • Proficiency with Stratus VOS and I-Series IOS operating systems and associated toolsets
  • Strong working knowledge of mainframe hardware and software ecosystems, including Crypto, FICON, and platform utilities
  • Demonstrated experience with IBM Security Portal and enterprise vulnerability management frameworks
  • Expert-level understanding of disaster recovery architectures and storage replication technologies
Leadership & Soft Skills
  • Proven track record of leading and scaling global, senior-level engineering teams in high-stakes environments
  • Exceptional analytical and problem-solving capabilities, with the ability to diagnose and resolve complex, multi-layered system failures under pressure
  • Outstanding communication and stakeholder management skills, with the ability to translate technical complexity for executive audiences
  • Strategic thinker who balances long-term platform vision with day-to-day operational excellence
  • Collaborative leader who builds trust across organizational boundaries and drives alignment across diverse IT disciplines
Minimum Qualification:
  • 15+ years of overall experience in systems engineering or infrastructure roles
  • Minimum 8 years of hands-on z/TPF Administration experience in large-scale, production environments
  • Demonstrated experience managing global engineering teams in a senior leadership capacity
  • Proven expertise in High Availability systems design, disaster recovery, and business continuity planning
Education:
  • Bachelor's degree/University degree or equivalent experience
  • Master's degree preferred
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
  • Job Family Group:
    Technology
  • Job Family:
    Systems & Engineering
  • Time Type:
    Full time
  • Primary Location:
    Rutherford New Jersey United States
  • Primary Location Full Time Salary Range:
    $170,000.00 - $300,000.00
    In addition to salary, Citi's offerings may also include, for eligible employees, discretionary and formulaic incentive and retention awards. Citi offers competitive employee benefits, including: medical, dental & vision coverage; 401(k); life, accident, and disability insurance; and wellness programs. Citi also offers paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays. For additional information regarding Citi employee benefits, please visit citibenefits.com. Available offerings may vary by jurisdiction, job level, and date of hire.
  • Most Relevant Skills:
    Please see the requirements listed above.
  • Other Relevant Skills:
    For complementary skills, please see above and/or contact the recruiter.
  • Anticipated Posting Close Date:
    May 31, 2026
  • Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi. View Citi's EEO Policy Statement and the Know Your Rights poster.