Facilitate smooth transitions from development to production, liaising with stakeholders to set and meet enterprise technical standards.
Documentation & Standards:
Produce and maintain comprehensive documentation: architecture diagrams, network configurations, troubleshooting guides, and deployment workflows.
System Monitoring & Health Maintenance:
Set up and utilize monitoring and logging tools to proactively identify and resolve performance or stability issues.
Qualifications:
5-7 years in Systems or Infrastructure Engineering with a strong track record in large-scale deployment projects.
3+ years of hands-on experience with OpenShift/Kubernetes, preferably in on-prem environments.
Experience with containerization (Docker, Kubernetes).
Demonstrated understanding of OpenShift Container Platform (OCP) architecture in a large enterprise environment.
Knowledge of OCP resources, including Pods, Deployments, StatefulSets, Services, Routes, Namespaces/Projects, ConfigMaps, Secrets, Persistent Volumes/Persistent Volume Claims, and resource requests/limits.
Node, cluster, and container troubleshooting.
Strong experience performing root cause analysis (RCA) for Pod crashes, restart loops, and containerized database instances such as Postgres and StarRocks.
Networking & Security:
Solid knowledge of networking (TCP/IP, DNS, DHCP), firewall configurations, and security best practices.
Monitoring Tools:
Familiarity with system monitoring, logging, and incident remediation frameworks.
Core Competencies:
Strong project coordination, proactive problem-solving, and cross-functional communication skills are essential.
3+ years of experience working with PostgreSQL.
Strong scripting/automation abilities in Python or a similar language.