- Professional
- Office in Reston
We are seeking a Data Engineer with hands-on AI/ML project experience in Databricks to join the Databricks Solution Team within the IRS Advanced Analytics Program (AAP). This role is responsible for building, optimizing, and maintaining data pipelines and feature engineering workflows that directly support model training, deployment, and monitoring for IRS mission teams.
As part of the AAP common services mission, the Data Engineer will deliver scalable, reusable, and compliant data engineering solutions using Databricks and AWS. The ideal candidate brings a strong background in data engineering for AI/ML use cases — ensuring data readiness and accessibility across the entire AI/ML lifecycle.
Key Responsibilities
- Design, build, and maintain data pipelines in Databricks (Spark, Delta Lake, MLflow) specifically tailored for AI/ML and GenAI use cases.
- Implement data ingestion, transformation, and feature engineering workflows that feed model training and inference processes.
- Collaborate with mission data scientists to ensure datasets are optimized for model development and experimentation.
- Integrate pipelines into CI/CD workflows for automated, repeatable, and compliant model operations.
- Optimize data workflows for performance, scalability, and cost-efficiency across multi-tenant workloads.
- Apply governance and security controls (Unity Catalog, IAM, audit logging) to protect sensitive IRS data.
- Support data validation, schema enforcement, and quality checks to ensure reliable model outcomes.
- Partner with Product Manager and Chief Architect to align data engineering capabilities with roadmap priorities and platform evolution.
Required Qualifications
- Bachelor’s degree in Computer Science, Data Engineering, or a related field and 14 or more years of experience; or a Master’s degree and 12 or more years of experience.
- Must be a U.S. Citizen with the ability to obtain and maintain a Public Trust security clearance.
- 5+ years of data engineering experience with AI/ML-focused projects.
- Hands-on expertise with Databricks, Spark, Delta Lake, and MLflow in the context of AI/ML pipelines.
- Proficiency in Python, SQL, and data transformation frameworks.
- Experience delivering feature engineering and data prep for model development and operationalization.
- Familiarity with ETL orchestration tools (Airflow, Databricks Workflows, or similar).
- Knowledge of CI/CD integration for data pipelines (Terraform, Git-based workflows).
- Awareness of AI/ML lifecycle data needs (training, validation, inference, retraining).
Desired Skills
- Certifications: Databricks Certified Data Engineer Associate/Professional.
- Experience in federal or regulated data environments (FedRAMP, NIST 800-53).
- Familiarity with AWS data services (S3, Glue, Lambda, Redshift) integrated with Databricks.
- Exposure to Trustworthy AI practices (bias monitoring, lineage, explainability).
- Strong problem-solving and collaboration skills with architects, MLOps engineers, and mission data scientists.
SAIC accepts applications on an ongoing basis and there is no deadline.
SAIC® is a premier Fortune 500® mission integrator focused on advancing the power of technology and innovation to serve and protect our world. Our robust portfolio of offerings across the defense, space, civilian and intelligence markets includes secure high-end solutions in mission IT, enterprise IT, engineering services and professional services. We integrate emerging technology, rapidly and securely, into mission-critical operations that modernize and enable critical national imperatives.
We are approximately 24,000 strong; driven by mission, united by purpose, and inspired by opportunities. SAIC is an Equal Opportunity Employer. Headquartered in Reston, Virginia, SAIC has annual revenues of approximately $7.5 billion. For more information, visit saic.com. For ongoing news, please visit our newsroom.