Job Description
Senior Software Engineer at Harrison Clarke in Daly City, California. Posted 3 days ago.
Type: full-time
Job Description:
Senior Platform Engineer - AI Infrastructure
$200-$300k base + Equity (depending on leveling)

We're building the infrastructure platform powering real-time AI workloads across distributed GPU environments. This is not a traditional DevOps role focused purely on CI/CD pipelines or infrastructure maintenance.

We're looking for a senior platform engineer who enjoys building and evolving complex infrastructure systems - someone who can operate across Kubernetes, distributed systems, networking, observability, and deployment architecture in a highly ambiguous, fast-moving environment.

You'll help shape the foundational infrastructure layer for a production AI platform running across multiple regions, cloud providers, and GPU environments.

What You'll Work On

- Build and evolve multi-region Kubernetes infrastructure across AWS and GPU cloud providers
- Design and improve internal platform systems that enable engineering and ML teams to deploy reliably at scale
- Own infrastructure-as-code across environments using Terraform and modern GitOps workflows
- Improve deployment architecture, cluster scalability, reliability, and operational efficiency
- Work on observability across distributed systems using metrics, logs, traces, and profiling
- Partner closely with ML and backend engineers on model serving infrastructure, workload optimization, and platform performance
- Improve networking and connectivity across distributed environments, including ingress, gateways, load balancing, and cross-region traffic
- Help shape security, secrets management, access controls, and infrastructure compliance practices
- Contribute to developer experience through tooling, automation, and infrastructure abstractions

What We're Looking For

We're looking for engineers who think in systems, not silos.
You'll likely have experience across several of the following:

- Operating Kubernetes in production at meaningful scale
- Building infrastructure platforms rather than simply maintaining deployments
- Infrastructure-as-code using Terraform, Pulumi, or similar tooling
- GitOps workflows using ArgoCD, FluxCD, Helm, or Kustomize
- Distributed infrastructure and multi-region environments
- Observability tooling such as Prometheus, Grafana, OpenTelemetry, or similar
- Strong infrastructure fundamentals across networking, reliability, and security
- Experience coding in Go or Python where needed for tooling, automation, or infrastructure systems
- Working in startup or high-growth environments with significant ownership and ambiguity

Nice to Have

- GPU infrastructure or ML platform exposure
- Experience with AI inference or model-serving systems
- Real-time networking or media infrastructure experience
- Exposure to GPU cloud providers such as CoreWeave, Crusoe, Lambda, or similar
- FinOps or infrastructure cost optimization experience

What This Role Is Not

This is not a narrowly scoped DevOps or support engineering position. We're not looking for someone focused solely on maintaining CI/CD pipelines or reacting to tickets. We're looking for engineers who enjoy designing systems, improving infrastructure foundations, and solving complex operational problems across the platform stack.

Why Join

You'll have significant ownership over foundational infrastructure decisions in a company scaling real-world AI workloads in production. This role is ideal for engineers who enjoy: solving ambiguous infrastructure problems, building scalable internal platforms, operating distributed systems, and working close to the intersection of infrastructure, performance, and developer enablement.