
Data Lakehouse Architect

SEA CORP

Middletown, RI (In Person)

Full-Time

Posted 1 week ago (Updated 22 hours ago) • Actively hiring

Expires 6/7/2026


Job Description

SEACORP
Location: Middletown, RI, United States
Req ID: req1806

SEACORP is seeking a well-qualified Data Lakehouse Architect.
Primary Duties and Responsibilities:

Job Summary:
SEACORP is seeking a Data Lakehouse Architect to lead the design, implementation, and evolution of a modern, tiered data platform that supports scalable ingestion, storage, processing, governance, and analytics. This position is in support of our SWFTS Data Strategy and Data Pipeline program. This role will define the target-state architecture for a lakehouse environment built on technologies including Kafka, Apache Iceberg, Amazon S3, CEPH, and Trino, while ensuring the platform is secure, performant, reliable, and cost-effective. The architect will partner with engineering, platform, analytics, security, and business teams to establish architectural standards, guide implementation, and enable high-quality data products across batch and streaming domains. The ideal candidate combines deep technical expertise in distributed data systems with strong design judgment, leadership, and the ability to translate business requirements into durable platform capabilities.
Job Responsibilities Include:
- Design and document lakehouse architecture using Kafka for streaming ingestion, Iceberg for table format and data management, S3 and/or CEPH for object storage, and Trino for distributed SQL query access.
- Define architecture for data partitioning, compaction, schema evolution, metadata management, table maintenance, and lifecycle policies.
- Architect data ingestion frameworks for both real-time and batch workloads, including event-driven and CDC-based integration patterns.
- Establish scalable, resilient, and secure storage patterns across cloud and on-premises or hybrid object storage environments.
- Define governance patterns including access control, encryption, data retention, lineage, auditability, and compliance integration.
- Partner with data engineers to optimize query performance, file sizing, partitioning strategy, and workload concurrency in Trino and related engines.
- Lead engineering teams and review designs, code, and deployment approaches for alignment with target architecture.
Qualifications:
Education:
Bachelor's degree in Computer Science, Engineering, Information Systems, or a related technical field.
Required Experience:
- Required knowledge of the Atlassian tool suite, Git, and Linux.
- Preferred knowledge of C++, Java, Python, and Linux.
- Ability to work in a fast-paced environment, collaborate with others, and handle independent tasking.
- Ability to learn new technologies quickly.
- 7+ years of experience in data engineering, data architecture, or platform architecture roles.
- 3+ years of experience designing and implementing modern data lake or lakehouse architectures in production environments.
- Hands-on experience with Apache Kafka for streaming data ingestion, event architecture, or real-time data integration.
- Hands-on experience with Apache Iceberg or a similar open table format in large-scale analytical environments.
- Experience designing data platforms on object storage, including Amazon S3, CEPH, or equivalent S3-compatible storage systems.
- Experience with Trino or similar distributed SQL query engines for interactive analytics over large datasets.
- Strong understanding of distributed systems principles, including scalability, fault tolerance, consistency tradeoffs, and performance tuning.
- Experience with data modeling, schema design, partitioning strategy, and optimization for analytical workloads.
- Experience with security architecture, including role-based access control, encryption, and data governance controls.
- Experience creating architecture documentation, technical standards, and implementation roadmaps.
- Strong knowledge of batch and streaming pipeline patterns, including CDC, event-driven design, and ingestion orchestration.
Desired Experience:
- Knowledge of databases, SQL and NoSQL (Postgres, MongoDB), Apache data frameworks (Kafka, Spark, Iceberg, OpenMetadata, Ranger), data infrastructure (Ceph, S3, MinIO/Parquet, REST, Nessie, Druid), and data APIs (Trino, Metabase, MLlib, Superset).
- Knowledge of software prototyping, VS Code, Cursor IDE, and prompt engineering.
- Master's degree in Computer Science, Data Engineering, Distributed Systems, or a related field.
- Experience with Team Submarine, SWFTS, US Navy program offices, and the TI/APB cycle.
- Experience with metadata catalogs such as Hive Metastore, AWS Glue Catalog, Nessie, or Polaris.
- Familiarity with data processing engines such as Spark, Flink, or dbt in lakehouse environments.
- Experience implementing data quality, observability, and lineage tooling.
- Experience supporting hybrid or multi-cloud data architectures.
- Familiarity with Kubernetes-based deployment and platform operations.
- Experience with regulated data environments and compliance frameworks such as SOC 2, HIPAA, PCI-DSS, or FedRAMP.
Exceptional Qualifications:
Candidates with knowledge in the following areas will be considered exceptional:
- Kubernetes, RKE2, containerization, Helm, AI/ML APIs, Spark ML, AI/ML integration (LLM development stack), DPCN training, PINN training, or agentic development integration.
- Recognized expertise designing enterprise-scale lakehouse platforms using open standards and interoperable tooling.
- Experience delivering software and systems for Team Submarine or SWFTS programs, including experience with Submarine platform tactical systems.
- Deep production experience with Kafka + Iceberg + Trino architectures, including performance optimization and operational scaling.
- Experience building platforms that span cloud and on-premises object storage, especially S3 and CEPH in hybrid deployments.
- Demonstrated success leading architecture for high-volume, low-latency, and mission-critical data ecosystems.
- Ability to make principled architectural decisions regarding catalogs, table maintenance, file formats, compaction, and query federation.
- Strong record of mentoring senior engineers and establishing architecture review processes and engineering standards.
- Experience leading major data platform migrations from legacy warehouse, Hadoop, or tightly coupled ETL ecosystems to modern lakehouse architectures.
- Ability to balance long-term architectural integrity with pragmatic delivery timelines and business value.

As a requirement of employment, all SEACORP employees must hold U.S. citizenship.

Location: Manassas, VA
Travel: Quarterly (approximately 4 times a year)
Clearance: Secret

Work Environment & Physical Demands:
Office and computer laboratories - sitting, standing, extended periods of time using a mouse and keyboard, and viewing computer screens. Infrequent lifting of
