Member of technical staff (Agent) - Reinforcement Learning at H Company

H Company · Paris, France · Hybrid

2025-08-28 14:00:00.0

Python

Senior
Optional office in Paris

About H:
H exists to push the boundaries of superintelligence with agentic AI. By automating complex, multi-step tasks typically performed by humans, AI agents will help unlock full human potential.

H is hiring the world’s best AI talent, seeking those who are dedicated as much to building safely and responsibly as to advancing disruptive agentic capabilities. We promote a mindset of openness, learning, and collaboration, where everyone has something to contribute.

About the Team: The Agent team defines new learning algorithms and agent paradigms to push the frontiers of agentic systems. We build upon foundation models and reinforcement learning to develop new approaches to train artificial general agents and work closely with the LLM/VLM and Safety teams to explore new directions.

This is a heavily engineering-focused role embedded within the research team. You will be responsible for defining the architecture and building the robust, scalable systems that underpin our research efforts. Your work will translate cutting-edge research concepts into high-performance, production-quality platforms, enabling the next generation of agentic AI.

Key Responsibilities:

- Design and implement novel Deep Reinforcement Learning algorithms to train large-scale agents.
- Propose and lead new research directions that combine state-of-the-art RL with foundation models (LLMs/VLMs).
- Develop sophisticated reward models and training environments to guide agent learning on complex, open-ended tasks.
- Create and manage massive benchmarks to rigorously evaluate and track agent capabilities, running comprehensive evaluation campaigns.

Requirements:

Technical/Research Skills:
- A PhD in Machine Learning, Computer Science, or a related field with a strong publication record (e.g., NeurIPS, ICML, ICLR) specifically in Reinforcement Learning.
- Deep theoretical and practical expertise in modern Deep RL, including on-policy/off-policy algorithms, reward modeling, and exploration strategies.
- Proven experience implementing and scaling complex RL algorithms from scratch in a major deep learning framework (PyTorch, JAX, or TensorFlow).
- Proficiency in Python and Git.
Soft Skills:
- Enjoys collaboration and thrives in a teamwork-oriented research environment.
- Impactful communication skills, with the ability to clearly articulate complex research ideas.
- Genuinely eager to explore the new challenges at the frontier of agentic AI.
Bonuses:
- Practical experience applying RL to systems built on Large Language Models (LLMs).
- Familiarity with building complex simulation environments for agent training.

Location:

H's teams are distributed throughout France and the UK.
This role can be fully remote, or hybrid for candidates based in cities where we have an office (currently Paris and London).

What We Offer:

- Join the exciting journey of shaping the future of AI, and be part of the early days of one of the hottest AI startups.
- Collaborate with a fun, dynamic, and multicultural team, working alongside world-class AI talent in a highly collaborative environment.
- Enjoy a competitive salary.
- Unlock opportunities for professional growth, continuous learning, and career development.

If you want to change the status quo in AI, join us.
