Job Description
Responsible for working within the Site Reliability Tools (SRT) team to build next-generation monitoring and site analytics capabilities; build and coordinate large-scale observability pipelines for PlayStation; participate in the full life cycle of system design, from concept, architecture, data governance, and application decomposition to deployment and creation of KPIs for tracking adoption; participate in the design, logic, flowcharting, data ingestion, data governance, visualization development, testing, debugging, documentation, and support of observability tools infrastructure; provide analysis of problems, recommend solutions, and assist in continuous improvement initiatives; develop, implement, and maintain enterprise observability solutions; configure, upgrade, scale out, patch, and tune Splunk Enterprise and CRIBL Stream; support governance of Splunk and CRIBL usage in order to provide an efficient platform; collaborate with architects, senior engineers, stakeholders, and leadership to create an observability architecture; utilize and apply knowledge of AWS, Azure, GCP, Datadog, Splunk, Grafana, CloudWatch, Terraform, Kubernetes, Docker, CloudFormation, and enterprise observability platforms to perform assigned tasks; conduct frequent capacity and cost reviews of observability tools; build automation and self-service capabilities to improve day-to-day operations; troubleshoot incidents and provide RCAs; and create technical documentation that enables operations teams to support the observability stack.
Location:
Troy, Michigan, and multiple undetermined worksites throughout the US
Salary:
$131,164 per year (benefits include medical, dental, vision, 401(k), STD/LTD, life insurance, and EAP)
Education:
Bachelor's degree in Computer Science, Computer Engineering, Information Technology, Information Studies, or a related field of study (will accept equivalent foreign degree).
Training:
None
Experience:
Two (2) years in the position above, as a DevOps Engineer, as a Platform Engineer, as a Reliability Engineer, as a Software (Site Reliability) Engineer, or in a related occupation.