Vice President of SRE Position Available In Miami-Dade, Florida
Job Description
Position Overview:
The Site Reliability Engineering (SRE) team is the first line of defense for a smoothly running site. They will monitor the site metrics on a 24x7x365 basis and ensure any issues are tracked to resolution. The DevSecOps team (peer) will be responsible for building out the dashboards that this team uses, so you will act as the product manager for those dashboards and may assist in the implementation. The Development teams will have increased responsibility to respond to pager duty alerts, so this team will ensure that the right teams are engaged and actively working issues. The team will continue to perform manual corrective actions to the databases as needed but will at the same time work with Product ownership to close product gaps that are leading to excessive manual work.
Responsibilities:
Establish a three-shift follow-the-sun model for site operations, with two shifts resident in our offices in Hyderabad and a third in our headquarters in Plantation, FL.
Establish standard policies and procedures for site operations, including incident management and problem tracking.
Ensure cross-training across all NationsBenefits site operations teams.
Work with engineering teams for ongoing increased visibility and reliability.
Identify and work across stakeholders to address operational issues.
Report to senior leadership on site incidents and recovery operations.
Work with DevSecOps and Dev Teams to implement automatic management capabilities such as auto-scaling.
Identify product and implementation gaps (including gaps in instrumentation) that impede the ability of the site to be operated at scale, including those that require regular manual work.
During incidents, work with Operational Governance and Engineering teams to ensure that correct engineering resources are engaged and tracking to problem resolution.
Post incident, work with teams to ensure RCAs are complete and correct.
Establish baseline metrics for all services and monitor over time.
Establish Green/Yellow/Red performance levels and escalate problems as metrics degrade release to release.
What We Offer:
Competitive salary and benefits package.
Opportunity to work on a groundbreaking FinTech application with a high degree of impact.
A collaborative and inclusive work environment that fosters innovation and growth.
Career development opportunities, including leadership training and mentorship.