Platform Engineer (Site Reliability Engineering - SRE)
Navy Federal Credit Union
Vienna, VA (In Person)
Full-Time
Job Description
We are seeking a highly skilled Platform Engineer to design, build, implement, and manage scalable, reliable, and reusable platforms, tools, and environments for application development, deployment, and operations. You will leverage deep technical expertise in container and virtual infrastructure, GitOps practices, Continuous Integration and Continuous Deployment (CI/CD) automation, and infrastructure standardization and optimization. You have strong experience with on-premises and cloud infrastructure, including installation, configuration, administration, support, and maintenance of IT software and hardware systems and internal databases for our enterprise clients. You will play a critical role in ensuring performance and resource availability of critical infrastructure supporting customers across Navy Federal Credit Union.

Qualifications
Bachelor's degree in Computer Science, Information Technology, Engineering, or equivalent experience
5+ years of experience in platform engineering, DevOps, Site Reliability Engineering (SRE), or related roles
Hands-on experience administering and supporting OpenShift or Kubernetes environments
Experience implementing GitOps workflows using Argo or Flux
Strong experience building CI/CD pipelines with Tekton and GitHub Actions
Experience with container technologies such as Docker and Kubernetes
Proficiency with Infrastructure as Code tools such as Terraform, Ansible, or similar
Strong scripting skills in Bash, Python, or similar languages
Experience with cloud platforms such as Amazon Web Services (AWS), Azure, or Google Cloud Platform
Familiarity with monitoring and observability tools such as Prometheus, Grafana, ELK, or Splunk
Strong problem-solving, communication, and collaboration skills

Desired Qualifications
Certified Kubernetes Administrator (CKA)
Certified Kubernetes Application Developer (CKAD)
Microsoft Certified: Azure Solutions Architect Expert
Red Hat Certified System Administrator (RHCSA)

Additional Information
Hours: Monday - Friday, 8:00AM - 4:30PM
Location:
820 Follin Lane, Vienna, VA 22180
5510 Heritage Oaks Drive, Pensacola, FL 32526
141 Security Drive, Winchester, VA 22602

Responsibilities
Manage and automate OCP (OpenShift Container Platform) and ARO (Azure Red Hat OpenShift) cluster upgrades, ensuring alignment with upstream releases and zero-downtime rolling updates
Configure tools like OADP (OpenShift API for Data Protection) to handle backup, restore, and failover across on-prem OCP and ARO regions
Monitor compute, storage, and networking capacity to prevent bottlenecks in hybrid Kubernetes environments
Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) specifically for cluster control planes and underlying cloud resources
Use and tune monitoring frameworks (e.g., Prometheus, Grafana) to surface actionable alerts and track core signals like latency, traffic, errors, and saturation
Participate in on-call rotations to triage cluster-level issues (e.g., node evictions, etcd latency, certificate expirations)
Manage continuous deployment pipelines using ArgoCD or Flux to ensure declarative, reproducible configurations for all cluster workloads
Abstract underlying Kubernetes complexities through self-service portals, standardized namespaces, and automated CI/CD guardrails
Consult with development teams to resolve deployment failures, routing, and pod-crash issues
Enforce Role-Based Access Control (RBAC) across environments, integrating with Active Directory or Microsoft Entra ID
Implement platform-level security guardrails using OpenShift policies and Open Policy Agent (OPA) to automatically detect configuration drift or vulnerabilities
Integrate enterprise vaults or Azure Key Vault seamlessly with native OpenShift secrets
Treat the platform as software by managing ARO clusters and underlying Azure resources using Terraform or Bicep
Write automation scripts (Python, Go, or Ansible) to streamline common Day 2 configuration tasks like deploying custom operators, storage classes (e.g., ODF), and monitoring add-ons
Collaborate with development, security, and operations teams to improve platform reliability and developer experience using iterative Agile processes and practices
Create and maintain technical documentation, standards, and operational procedures using Docs as Code
Support continuous improvement initiatives focused on scalability, resiliency, and automation of both on-prem and cloud environments