Data Lakehouse Architect
SEACORP
Manassas, VA (In Person)
Full-Time
Job Description
SEACORP is seeking a well-qualified Data Lakehouse Architect.
Primary Duties and Responsibilities:
Job Summary:
SEACORP is seeking a Data Lakehouse Architect to lead the design, implementation, and evolution of a modern, tiered data platform that supports scalable ingestion, storage, processing, governance, and analytics. This position supports our SWFTS Data Strategy and Data Pipeline program. This role will define the target-state architecture for a lakehouse environment built on technologies including Kafka, Apache Iceberg, Amazon S3, CEPH, and Trino, while ensuring the platform is secure, performant, reliable, and cost-effective. The architect will partner with engineering, platform, analytics, security, and business teams to establish architectural standards, guide implementation, and enable high-quality data products across batch and streaming domains. The ideal candidate combines deep technical expertise in distributed data systems with strong design judgment, leadership, and the ability to translate business requirements into durable platform capabilities.
Job Responsibilities Include:
Design and document lakehouse architecture using Kafka for streaming ingestion, Iceberg for table format and data management, S3 and/or CEPH for object storage, and Trino for distributed SQL query access.
Define architecture for data partitioning, compaction, schema evolution, metadata management, table maintenance, and lifecycle policies.
Architect data ingestion frameworks for both real-time and batch workloads, including event-driven and CDC-based integration patterns.
Establish scalable, resilient, and secure storage patterns across cloud and on-premises or hybrid object storage environments.
Define governance patterns including access control, encryption, data retention, lineage, auditability, and compliance integration.
Partner with data engineers to optimize query performance, file sizing, partitioning strategy, and workload concurrency in Trino and related engines.
Lead engineering teams and review designs, code, and deployment approaches for alignment with target architecture.
Qualifications:
Education:
Bachelor's degree in Computer Science, Engineering, Information Systems, or a related technical field.
Required Experience:
Required knowledge of Atlassian Tool Suite, Git, and Linux.
Preferred knowledge in C++, Java, Python, Linux.
Candidate should have the ability to work in a fast-paced work environment.
Able to collaborate with others while being able to handle independent tasking.
Ability to learn new technologies quickly.
7+ years of experience in data engineering, data architecture, or platform architecture roles.
3+ years of experience designing and implementing modern data lake or lakehouse architectures in production environments.
Hands-on experience with Apache Kafka for streaming data ingestion, event architecture, or real-time data integration.
Hands-on experience with Apache Iceberg or a similar open table format in large-scale analytical environments.
Experience designing data platforms on object storage, including Amazon S3, CEPH, or equivalent S3-compatible storage systems.
Experience with Trino or similar distributed SQL query engines for interactive analytics over large datasets.
Strong understanding of distributed systems principles, including scalability, fault tolerance, consistency tradeoffs, and performance tuning.
Experience with data modeling, schema design, partitioning strategy, and optimization for analytical workloads.
Experience with security architecture including role-based access control, encryption, and data governance controls.
Experience creating architecture documentation, technical standards, and implementation roadmaps.
Strong knowledge of batch and streaming pipeline patterns, including CDC, event-driven design, and ingestion orchestration.
Desired Experience:
Desired knowledge in the areas of Databases, SQL and NoSQL (Postgres, MongoDB), Apache Data Frameworks (Kafka, Spark, Iceberg, OpenMetadata, Ranger), Data Infrastructure (Ceph, S3, MinIO/Parquet, REST, Nessie, Druid), Data APIs (Trino, Metabase, MLlib, Superset).
Desired knowledge in the areas of software prototyping, VS Code, Cursor IDE, and prompt engineering.
Master's degree in Computer Science, Data Engineering, Distributed Systems, or a related field.
Experience with Team Submarine, SWFTS, US Navy program offices, TI/APB cycle.
Experience with metadata catalogs such as Hive Metastore, AWS Glue Catalog, Nessie, or Polaris.
Familiarity with data processing engines such as Spark, Flink, or dbt in lakehouse environments.
Experience implementing data quality, observability, and lineage tooling.
Experience supporting hybrid or multi-cloud data architectures.
Familiarity with Kubernetes-based deployment and platform operations.
Experience with regulated data environments and compliance frameworks such as SOC 2, HIPAA, PCI-DSS, or FedRAMP.
Exceptional Qualifications:
Candidates with knowledge of the following technologies will be considered exceptional: Kubernetes, RKE2, containerization, Helm, AI/ML APIs, SparkML, AI/ML Integration (LLM Development Stack), DPCN training, PINN training, or agentic development integration.
Recognized expertise designing enterprise-scale lakehouse platforms using open standards and interoperable tooling.
Experience delivering software and systems for Team Submarine or SWFTS programs, including experience with the Submarine platform tactical systems.
Deep production experience with Kafka + Iceberg + Trino architectures, including performance optimization and operational scaling.
Experience building platforms that span cloud and on-premises object storage, especially S3 and CEPH in hybrid deployments.
Demonstrated success leading architecture for high-volume, low-latency, and mission-critical data ecosystems.
Ability to make principled architectural decisions regarding catalogs, table maintenance, file formats, compaction, and query federation.
Strong record of mentoring senior engineers and establishing architecture review processes and engineering standards.
Experience leading major data platform migrations from legacy warehouse, Hadoop, or tightly coupled ETL ecosystems to modern lakehouse architectures.
Ability to balance long-term architectural integrity with pragmatic delivery timelines and business value.
As a requirement of employment, all SEACORP employees must hold U.S. citizenship.
Location:
Manassas, VA
Travel:
Quarterly (approximately 4 times a year)
Clearance:
Secret
Work Environment & Physical Demands:
Office & Computer Laboratories - Sitting, standing, extended periods of time using a mouse and keyboard and viewing computer screens. Infrequent lifting of