Job Description
Platform Architect at UsefulBI Corporation in Oakland, California
Posted 4 days ago.
Type: full-time
ABOUT USEFULBI
UsefulBI is a global AI and data transformation partner helping enterprises turn data and technology into competitive advantage. With 600+ successful projects and AWS Generative AI Competency recognition, we serve Fortune 500 clients across Life Sciences, Financial Services, and Technology.

ROLE OVERVIEW
We are seeking an experienced Data Platform Architect to lead the design and implementation of an enterprise-grade, AI-ready data platform - spanning cloud storage, data engineering pipelines, governance-ready data products, and GenAI / RAG workloads. The ideal candidate brings deep expertise with AWS-native services, Databricks, and modern data stack tools, with a strategic mindset to architect platforms delivering governed, AI-consumable datasets at scale.

KEY RESPONSIBILITIES
1. Storage & Platform Foundation
Architect the AWS Data Platform: Amazon S3, Athena, Redshift, and AWS Lake Formation for governed access
Design and operate Databricks Lakehouse (Delta Lake, Unity Catalog, Databricks Compute)
Ensure secure, scalable, and cost-optimized foundational infrastructure aligned to enterprise SLAs

2. Processing & Engineering Layer
Lead data engineering pipelines using dbt (transformations & modeling) and Databricks (ML Runtime, Notebooks)
Implement Data Quality & Testing using SODA; design Workflow Orchestration with monitoring and alerting
Drive a shift-left engineering culture with reusable, well-documented pipelines

3. Data Products Layer - AI-Ready Datasets
Define and build domain-owned, AI-ready Data Products (Customer 360, Patient 360, Clinical, Commercial, Feature-Ready, and RAG-Ready datasets)
Champion a Data Mesh / Data Product mindset with clear ownership, SLAs, and discoverability

4. Governance & Catalog Layer
Implement unified data governance using Atlan (Governance & Catalog) for discoverability, lineage, trust, and control
Establish end-to-end Data Lineage tracking (data to production) using classification tagging
Define and enforce Policies & Access controls: RBAC/ABAC, PII/PHI tagging, sensitive data classification, and policy enforcement
Drive AI Governance practices: AI data usage policies, model lineage, bias detection, and audit readiness
Ensure Quality & Trust standards: data quality SLAs, certification workflows, and DQ dashboards

5. AI & Advanced Analytics Enablement
Architect data infrastructure supporting ML Models (predictive, prescriptive, optimization) and GenAI/RAG pipelines
Design pipelines for LLM apps, copilots, and knowledge search; support real-time Decision Systems
Collaborate with domain teams to build Domain-Owned AI & ML models on governed, trusted data

6. Cross-Cutting Capabilities
Security & Compliance: End-to-end data security, IAM, encryption, and audit-ready regulatory compliance (HIPAA, GDPR, SOC 2)
Monitoring & Observability: Platform-level data and model observability, pipeline health dashboards
Collaboration: Tools and practices to connect people, knowledge, and data across the organization

REQUIRED SKILLS & QUALIFICATIONS
Cloud & Storage
10+ years of experience in data architecture and cloud platforms
Deep expertise with AWS services: S3, Athena, Redshift, Lake Formation, Glue, IAM
Proficiency in Databricks: Delta Lake, Unity Catalog, Databricks Workflows, MLflow
Experience with data lakehouse architecture patterns and medallion architecture (Bronze / Silver / Gold)

Data Engineering & Transformation
Strong hands-on expertise with dbt (data build tool) for SQL-based transformations and data modeling
Pipeline orchestration experience: Apache Airflow, Databricks Workflows, or equivalent
Data quality tooling: SODA, Great Expectations, or Monte Carlo
Proficiency in Python, SQL, and Spark

Data Governance & Catalog
Experience implementing data catalogs and governance tools (Atlan, Collibra, Alation, or equivalent)
Knowledge of data lineage, metadata management, classification frameworks
Understanding of RBAC/ABAC, PII/PHI regulations, and policy enforcement at the platform level

AI & GenAI Readiness
Experience building Feature Stores and ML-ready datasets for model training and inference
Familiarity with RAG (Retrieval-Augmented Generation) architecture: chunking, embedding generation, vector stores
Understanding of LLMOps, model observability, and AI governance frameworks
Experience with vector databases (Pinecone, Weaviate, pgvector, or Databricks Vector Search)

Architecture & Leadership
Proven ability to design and document enterprise data architecture using frameworks like TOGAF or Zachman
Strong stakeholder engagement skills - ability to translate business needs into technical architecture
Experience working in regulated industries (Life Sciences, Financial Services, Healthcare) is a strong plus
AWS Certified Solutions Architect (Professional) or Databricks Certified Professional preferred

NICE TO HAVE
Experience with ThoughtSpot, Tableau, or Power BI for self-service BI layer integration
Familiarity with SAS Viya or clinical analytics platforms
Knowledge of FHIR, HL7, or CDISC data standards for Life Sciences
Contributions to open-source data engineering projects
Experience with multi-cloud or hybrid-cloud architectures

WHAT WE OFFER
Opportunity to architect one of the most comprehensive AI-ready data platforms in the industry
Exposure to Fortune 500 clients across Life Sciences, Financial Services, and Technology
AWS Generative AI Competency partner - work on cutting-edge GenAI and LLM use cases
Collaborative, innovation-first culture with a clear career growth path
Competitive compensation, flexible work arrangements (remote/hybrid available)
Access to certifications, training, and industry conferences
Comprehensive Benefits: Medical, Dental, and Vision insurance coverage for employees and eligible dependents.
Retirement Benefits: 401(k) retirement savings plan with company benefits as per policy.