- Professional
- Office in Arlington
ECS is seeking an MLOps Integration Engineer to work in our Arlington, VA office.
Job Summary:
We are seeking an experienced MLOps Integration Engineer to design, deploy, and optimize machine learning pipelines that support the secure, reliable, and efficient operation of AI models in production. The MLOps Integration Engineer will lead the automation of end-to-end ML workflows, from model deployment and versioning to monitoring, drift detection, and compliance logging. This role focuses on building scalable infrastructure and observability frameworks that keep models performant, traceable, and aligned with mission and business objectives across cloud and on-premises environments.
Responsibilities:
- Deploy and manage ML models in production using tools such as MLflow, Kubeflow, or AWS SageMaker, ensuring scalability, low latency, and availability.
- Design and maintain dashboards in Grafana or Kibana, backed by metrics collected with tools such as Prometheus, to track real-time and historical model performance (e.g., accuracy, latency, throughput).
- Build automated drift-detection pipelines using tools like Evidently AI or Alibi Detect to identify data distribution shifts and trigger retraining or alerts.
- Implement centralized logging with ELK Stack or OpenTelemetry to capture inference events, system errors, and audit trails for debugging, compliance, and model governance.
- Develop CI/CD pipelines using GitHub Actions, Jenkins, or Azure DevOps to automate model builds, testing, deployment, and rollback.
- Apply secure-by-design principles to safeguard AI pipelines through encryption, access control, and compliance with frameworks such as GDPR, HIPAA, and NIST AI RMF.
- Partner with data scientists, AI engineers, DevOps, and security teams to ensure seamless model integration and lifecycle management.
- Optimize model inference performance through techniques such as quantization, pruning, and container orchestration for efficient resource utilization across AWS, Azure, or Google Cloud.
- Develop comprehensive documentation for ML pipelines, observability configurations, and monitoring workflows to promote operational transparency and knowledge sharing.