Senior Data Engineer (F/M/D) at Animore

Animore · Munich, Germany · On-site

Description

The Opportunity

We’re looking for a Senior Data Engineer to architect and scale the data backbone powering next-generation AI models in robotics and real-world environments.

This role sits at the intersection of distributed systems, multimodal data processing, and applied machine learning, with a strong focus on building high-quality datasets for robotic foundation models. You will ensure that data pipelines, infrastructure, and data strategy directly translate into measurable improvements in model performance.

Your Responsibilities

  • Drive the model–data loop by connecting application requirements with data collection, and translating model failures into data-driven improvements through collection, curation, and augmentation
  • Build and scale distributed data pipelines (Ray/Anyscale or similar) for TB-scale video, sensor, and robotics datasets
  • Design multimodal data schemas aligning video, actions, and high-frequency sensor streams
  • Develop Python tooling for data quality, including cleaning, anomaly detection, and dataset versioning
  • Own dataset quality and coverage, including annotation workflows, data diversity, and storage trade-offs
  • Lead a small team and coordinate with data providers and annotation vendors
  • Oversee real-world data collection, including technical setup, compliance, and secure data handling

Technologies

  • Python (advanced, production-grade)
  • Ray / Anyscale or Apache Spark
  • AWS / GCP for large-scale data and GPU training pipelines
  • Video and sensor data formats (H.264/H.265, ROS bags, MCAP)
  • PyTorch, NumPy
  • DVC, LakeFS or similar data versioning tools
  • Distributed data processing and storage systems

Requirements

Must Have

    • 5+ years in Data/ML Engineering, including 2+ years in a senior or lead role
    • Experience with large-scale real-world data (robotics, autonomous systems, or video AI)
    • Strong experience with Ray/Anyscale or Spark for distributed pipelines
    • Advanced Python (performance, concurrency, and the ML stack, e.g., NumPy/PyTorch)
    • Experience working with video and sensor data formats (e.g., H.264/H.265, ROS bags, MCAP)
    • Experience building scalable data pipelines for GPU-based training workloads (AWS/GCP)
    • Experience with data versioning tools such as DVC or LakeFS
    • Proven experience owning systems and mentoring engineers

Nice to Have

    • Experience building datasets for multimodal foundation models (VLA, VLM or similar)
    • Robotics fundamentals (sensor synchronization, 3D transforms)
    • Experience with active learning or data-centric ML workflows

Benefits

  • Competitive compensation package
  • Various employee subsidies and perks, including public transportation and Wellpass
  • Work with a world-class team in a flat hierarchy, with direct collaboration alongside the founders and engineering team
  • Opportunity to make a real impact by working on cutting-edge robotics and AI systems
  • Fast growth potential in a rapidly evolving company and industry
  • International office environment with English as the official working language

Recruiting Process

Your recruiting partner for this role is Madhulika (she/her). You can expect a screening call and up to four rounds of interviews, including an onsite visit to our Munich office to meet the team.

We hire across backgrounds, identities, and experiences, and we are committed to a workplace where everyone belongs. Discrimination has no place here.

If you need any accommodations during the recruiting process, just reach out to your recruiting partner.
