Data Engineer
Software Guidance & Assistance
Rockville, MD (In Person)
$173,680 Salary, Full-Time
Job Description
- $85/hr
- Hybrid | Technology
- Job ID: 26-01183
- Last updated: 3 days ago
- Location: Rockville, MD

Software Guidance & Assistance, Inc. (SGA) is searching for a Data Engineer for a CONTRACT assignment with one of our premier Regulatory clients in Rockville, MD or Tysons, VA.

The Data Engineer works with moderate supervision across two equally weighted domains: (1) large-scale data pipeline development processing market events in a cloud environment, and (2) design and development of agentic AI systems, including LLM-powered regulatory data assistants, MCP servers, and agent harness architectures. This position contributes to overall product quality throughout the software development lifecycle.

Responsibilities
- Build and maintain ETL/ELT pipelines using Apache Spark, Hive, and Trino across S3-based data lake environments
- Develop and optimize SQL for large-scale surveillance datasets including window functions, multi-table joins, and complex aggregations
- Build and engineer big data systems (EMR-on-EC2, EMR-on-EKS) and develop solutions on analytical platforms (SageMaker, Domino, Dataiku)
- Participate in data quality monitoring, anomaly detection, and production incident investigation
- Develop AI agent systems using AWS Bedrock and agent frameworks (Strands Agents SDK, LangChain/LangGraph, or equivalent)
- Build agent harness architectures combining LLM reasoning with deterministic execution: skill/RAG-based SQL generation and structured output validation
- Implement agent memory, context management, and tool integration (MCP servers, API connectors, data catalog lookups) across the data lake
- Build evaluation frameworks for agent accuracy: paraphrase robustness, routing precision, and structural consistency
- Stay informed of advances in LLM frameworks (LangGraph, Google ADK, AWS Strands) and emerging AI capabilities
- Write clean, well-tested code; contribute to CI/CD Jenkins pipelines and infrastructure-as-code on AWS
- Ensure secure handling of RCI and sensitive regulatory data across both data pipelines and agent outputs, with auditable execution traces
- Adhere to FINRA and team standards for secure development practices and technology policies
- Partner across teams, communicate technical information at the appropriate level, and maintain documentation on Confluence/Wiki
- Actively learn from senior team members; contribute to process improvement in line with FINRA's values of collaboration, expertise, innovation, and responsibility

Required Skills

Data Engineering & Big Data Technologies
- Experience building data pipelines using Apache Spark (PySpark preferred) and SQL
- Experience with SQL query engines (Hive, Trino/Presto, or similar) and cloud data platforms (AWS S3, EMR, Lambda)
- Understanding of common issues like data skew and strategies to mitigate it, working with large data volumes, and troubleshooting job failures due to resource limitations, bad data, and scalability challenges
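The data-skew bullet above can be illustrated with a small, framework-free Python sketch of key salting: records for a known-hot key are spread across several buckets before aggregation, then re-merged. In Spark the same idea applies to join and groupBy keys; all names and the bucket count here are hypothetical.

```python
import random
from collections import defaultdict

SALT_BUCKETS = 4  # hypothetical number of salt partitions for hot keys

def salted_key(key: str, hot_keys: set) -> str:
    """Append a random salt to known-hot keys so their records
    spread across several partitions instead of piling onto one."""
    if key in hot_keys:
        return f"{key}#{random.randrange(SALT_BUCKETS)}"
    return key

# Toy event stream: one key ("AAPL") dominates and would skew a groupBy.
events = [("AAPL", 1)] * 1000 + [("MSFT", 1)] * 10
hot = {"AAPL"}

# First pass: partial counts per salted key (parallelizable buckets).
partitions = defaultdict(int)
for key, count in events:
    partitions[salted_key(key, hot)] += count

# Second pass: strip the salt and merge partial counts per real key.
totals = defaultdict(int)
for skey, count in partitions.items():
    totals[skey.split("#")[0]] += count

print(dict(totals))
```

The salted first pass keeps any single bucket from holding all of the hot key's records; the cheap second pass restores correct totals.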
- Real-world experience with debugging and mitigation strategies

Generative AI & Agentic Systems
- Practical experience building LLM-powered agent systems that use tools and produce structured outputs (not just chatbot interfaces)
- Hands-on experience with at least one agent framework: LangChain, LangGraph, AWS Strands, or equivalent
- Working knowledge of prompt engineering, RAG architectures, and context/memory management
- Experience with foundation model APIs (Anthropic Claude, Amazon Nova, OpenAI, or similar)
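The "structured outputs" requirement above can be sketched without any real model: a stubbed LLM call returns JSON, and a small validation loop checks it against a required-field contract, retrying once before degrading gracefully. The stub, field names, and contract are all hypothetical.

```python
import json

REQUIRED_FIELDS = {"sql", "confidence"}  # hypothetical output contract

def fake_llm(prompt: str, attempt: int) -> str:
    """Stand-in for a real foundation-model call; the first attempt
    returns a malformed payload so the retry path is exercised."""
    if attempt == 0:
        return "SELECT 1  -- not JSON at all"
    return json.dumps({"sql": "SELECT 1", "confidence": 0.9})

def generate_structured(prompt: str, max_attempts: int = 2):
    """Validate model output against the contract; retry on failure,
    then degrade gracefully to None instead of raising."""
    for attempt in range(max_attempts):
        raw = fake_llm(prompt, attempt)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON -> retry
        if isinstance(payload, dict) and REQUIRED_FIELDS <= payload.keys():
            return payload
    return None  # graceful degradation after exhausting attempts

result = generate_structured("count trades per trader")
print(result)  # → {'sql': 'SELECT 1', 'confidence': 0.9}
```

Swapping `fake_llm` for a Bedrock or Claude API call leaves the validation loop unchanged, which is the point of separating generation from verification.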
- Memory Architecture: understanding of agent memory tiers (working memory, episodic memory, semantic memory) and strategies for context persistence, pruning, and retrieval across sessions
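The three memory tiers can be made concrete with a minimal, hypothetical Python sketch: a bounded working memory that prunes itself, an append-only episodic log keyed by session, and a durable semantic store of facts. Class and field names are invented for illustration.

```python
from collections import deque

class AgentMemory:
    """Minimal sketch of the memory tiers named above: working
    (current context, pruned FIFO), episodic (per-session history),
    and semantic (long-lived key facts)."""

    def __init__(self, working_limit: int = 4):
        self.working = deque(maxlen=working_limit)  # auto-pruned
        self.episodic: list = []                    # (session_id, event)
        self.semantic: dict = {}                    # durable facts

    def observe(self, session_id: str, event: str) -> None:
        self.working.append(event)
        self.episodic.append((session_id, event))

    def remember_fact(self, key: str, value: str) -> None:
        self.semantic[key] = value

    def context(self, session_id: str) -> list:
        """Assemble a prompt context: durable facts first, then this
        session's episodes, then whatever survived working-memory
        pruning (a real system would also deduplicate these tiers)."""
        facts = [f"{k}={v}" for k, v in self.semantic.items()]
        episodes = [e for s, e in self.episodic if s == session_id]
        return facts + episodes + list(self.working)

mem = AgentMemory(working_limit=2)
mem.remember_fact("user_role", "analyst")
for i in range(3):
    mem.observe("s1", f"turn-{i}")

# Working memory kept only the 2 most recent turns; episodic kept all 3.
print(mem.context("s1"))
```

The `maxlen` deque is the pruning strategy; persistence and retrieval across sessions would hang off the episodic and semantic tiers.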
- Agent Harness Design: familiarity with harness patterns that wrap LLM reasoning with deterministic guardrails, tool routing, verification loops, and graceful degradation

AI Tool Proficiency
- Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.)
- Experience with spec-driven development: using structured specifications to guide AI code generation, review, and validation
- Ability to leverage AI pair programming for code suggestions, debugging, refactoring, and automated test generation

Cloud Technologies
- Experience with AWS services like S3, EMR, EMR on EKS, Lambda, Bedrock, Step Functions, etc.
- Hands-on experience using S3 with Spark (e.g., dealing with file formats, consistency issues)
- Familiarity with AWS Bedrock for foundation model invocation, knowledge bases, guardrails, and agent orchestration
- Exposure to Google Cloud Vertex AI (model garden, grounding, agent builder) or equivalent managed AI platforms
- Familiarity with AWS monitoring and logging tools (CloudWatch, CloudTrail) for production workloads

Programming (Python)
- Proficiency in Python for data engineering and automation
- Ability to write clean, modular, and performant code
- Experience with functional programming concepts (e.g., immutability, higher-order functions)
- Strong understanding of collections, concurrency, and memory management

SQL Skills (Window Functions, Joins, Complex Queries)
- Proficiency with SQL window functions, multi-table joins, and aggregations
- Ability to write and optimize complex SQL queries
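These SQL skills can be exercised end-to-end with a tiny in-memory `sqlite3` sketch (table and column names are invented): a window function computes a running total per partition, and `COALESCE` handles the NULL edge case so the aggregate stays well-defined.

```python
import sqlite3

# In-memory toy "surveillance" table; all names are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trades (trader TEXT, ts INTEGER, qty INTEGER)")
con.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [("a", 1, 100), ("a", 2, 50), ("b", 1, None), ("b", 2, 70)],
)

# Window function: running total of qty per trader, ordered by time,
# with NULL qty coalesced to 0 so the running sum never goes NULL.
rows = con.execute(
    """
    SELECT trader, ts,
           SUM(COALESCE(qty, 0)) OVER (
               PARTITION BY trader ORDER BY ts
           ) AS running_qty
    FROM trades
    ORDER BY trader, ts
    """
).fetchall()
print(rows)  # → [('a', 1, 100), ('a', 2, 150), ('b', 1, 0), ('b', 2, 70)]
```

The same `PARTITION BY ... ORDER BY` shape carries over directly to Hive and Trino, just over much larger datasets.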
- Experience handling edge cases like NULLs, duplicates, and ordering

Good to Have
- AWS Bedrock AgentCore (memory, identity, tool gateway)
- Model Context Protocol (MCP) server development and integration
- Agent evaluation harnesses and agentic patterns (draft-verification, compile-style generation)
- Fine-tuning foundation models for domain-specific tasks (LoRA, PEFT, or managed fine-tuning via Bedrock/Vertex AI)
- Local model execution with Ollama, vLLM, or similar for development and experimentation
- Vector databases (FAISS, Pinecone, OpenSearch)
- Docker, Kubernetes, and Amazon EKS for containerized workloads
- Infrastructure as Code (Terraform, CloudFormation)
- Experience with CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, ArgoCD)
- Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack)
- AWS certifications (AI Practitioner, Solutions Architect, or Kubernetes certifications like CKA/CKAD)

Education / Experience Requirements
- Bachelor's degree in Computer Science, Data Science, Information Systems, or related discipline with at least two (2) years of related experience; or equivalent training and/or work experience.