3Pillar is an AI transformation partner on a mission to help enterprises build the AI-native products and intelligent agents that will define the next era of business. With teams across North America, Europe, Latin America, and Asia, we work with the most ambitious companies in financial services, healthcare, media, and technology, helping them move faster, modernize boldly, and compete on their own terms. Our HelixAI platform and Helix Pods delivery model put our engineers at the center of real agentic transformation, doing work that is open, portable, and built to last. We are building the future of enterprise AI.
We are looking for a Lead Data Engineer to build, operate, and continuously improve the data pipelines, retrieval infrastructure, and ML/LLMOps foundations that power our AI initiatives. In this role, you will turn reference architectures and data contracts into robust, production-grade implementations that serve conversational AI assistants, dashboard copilots, autonomous agents, RAG applications, and predictive ML models.
Key Responsibilities:
Data Pipeline Engineering: Build, test, and maintain production pipelines (batch and real-time) on Snowflake, PySpark, Delta Lake, and Kafka.
Implement data quality checks, schema validation, and alerting at every pipeline stage.
Migrate legacy ETL/DWH to cloud-native AWS/Azure architectures with measurable latency and cost improvements.
Implement chunking, metadata filtering, and re-ranking, tuning for precision, recall, and latency.
Maintain data freshness and index consistency; instrument with context relevance and faithfulness metrics.
Semantic Layer & Knowledge Infrastructure: Implement and maintain business entity mappings, ontologies, and knowledge graphs (Neo4j) per the Architect's design.
Build and version the feature store and semantic data contracts serving both ML models and LLM applications.
Manage metadata, data lineage, and audit trail instrumentation across the platform.
ML/LLMOps Pipeline Support: Build ML data infrastructure, including training data curation, feature engineering, MLflow experiment tracking, and dataset versioning.
Support LLM fine-tuning workflows — corpus curation, quality filtering, dataset formatting.
Maintain text-to-SQL layers, semantic query interfaces, and context APIs for conversational AI consumers.
Governance, Security & Data Quality: Implement RBAC, attribute-based access, PII detection/masking, data classification, and audit logging.
Enforce data contracts and schema governance with automated breaking-change detection and versioned migrations.
Build data quality monitoring (completeness, freshness, consistency) with automated alerting and root-cause tooling.
Support compliance readiness: audit trails, data provenance, and regulatory documentation.
Qualifications:
7+ years of data engineering experience using cloud services.
2+ years of experience with production AI/ML or LLM-era data infrastructure. Proven experience building production pipelines at scale (batch and streaming) on Snowflake and AWS/Azure.
Deep expertise: Python, PySpark, Snowflake, Delta Lake, Kafka, Spark Structured Streaming.
Hands-on with vector stores, embedding pipelines, and retrieval infrastructure in production RAG environments.
Working knowledge of MLOps: MLflow, CI/CD for AI, automated evaluation, and production monitoring.
Strong grounding in data governance, quality frameworks, and compliance-aligned engineering.