Senior Data Engineer (F/M/D) at Animore

Animore · Munich, Germany · On-site

Description

The Opportunity

We’re looking for a Senior Data Engineer to architect and scale the data backbone powering next-generation AI models in robotics and real-world environments.

This role sits at the intersection of distributed systems, multimodal data processing, and applied machine learning, with a strong focus on building high-quality datasets for robotic foundation models. You will ensure that data pipelines, infrastructure, and data strategy directly translate into measurable improvements in model performance.

Your Responsibilities

  • Drive the model–data loop: connect application requirements to data collection, and turn model failures into targeted improvements through data collection, curation, and augmentation
  • Build and scale distributed data pipelines (Ray/Anyscale or similar) for TB-scale video, sensor, and robotics datasets
  • Design multimodal data schemas aligning video, actions, and high-frequency sensor streams
  • Develop Python tooling for data quality, including cleaning, anomaly detection, and dataset versioning
  • Own dataset quality and coverage, including annotation workflows, data diversity, and storage trade-offs
  • Lead a small team and coordinate with data providers and annotation vendors
  • Oversee real-world data collection, including technical setup, compliance, and secure data handling

Technologies

  • Python (advanced, production-grade)
  • Ray / Anyscale or Apache Spark
  • AWS / GCP for large-scale data and GPU training pipelines
  • Video and sensor data formats (H.264/H.265, ROS bags, MCAP)
  • PyTorch, NumPy
  • DVC, LakeFS or similar data versioning tools
  • Distributed data processing and storage systems

Requirements

Must Have

    • 5+ years in Data/ML Engineering, including 2+ years in a senior or lead role
    • Experience with large-scale real-world data (robotics, autonomous systems, or video AI)
    • Strong experience with Ray/Anyscale or Spark for distributed pipelines
    • Advanced Python (performance, concurrency, and the ML stack, e.g., NumPy and PyTorch)
    • Experience working with video and sensor data formats (e.g., H.264/H.265, ROS bags, MCAP)
    • Experience building scalable data pipelines for GPU-based training workloads (AWS/GCP)
    • Experience with data versioning tools such as DVC or LakeFS
    • Proven experience owning systems and mentoring engineers

Nice to Have

    • Experience building datasets for multimodal foundation models (VLA, VLM or similar)
    • Robotics fundamentals (sensor synchronization, 3D transforms)
    • Experience with active learning or data-centric ML workflows

Benefits

  • Competitive compensation package
  • Various employee subsidies and perks, including public transportation and Wellpass
  • Work with a world-class team in a flat hierarchy, with direct collaboration alongside the founders and engineering team
  • Opportunity to make a real impact by working on cutting-edge robotics and AI systems
  • Fast growth potential in a rapidly evolving company and industry
  • International office environment with English as the official working language

Recruiting Process

Your recruiting partner for this role is Madhulika (she/her). You can expect a screening call and up to four rounds of interviews, including an on-site visit to our Munich office to meet the team.

We hire across backgrounds, identities, and experiences, and we are committed to a workplace where everyone belongs. Discrimination has no place here.

If you need any accommodations during the recruiting process, just reach out to your recruiting partner.
