Research Engineer at Ashby HQ

Ashby HQ · San Francisco, United States of America · Onsite

Responsibilities:

  • Design and maintain evaluation frameworks that measure AI output quality across all experiences, developing metrics and benchmarks to assess model performance

  • Systematically improve production prompts through iterative experimentation: diagnosing failure patterns, crafting targeted improvements, and validating against quality benchmarks

  • Fine-tune models on targeted datasets to improve baseline performance (e.g., preventing poor layout choices, improving outline quality)

  • Conduct rigorous experiments to understand model behavior, analyze results, and derive insights that inform prompt and model improvements

  • Build tools and workflows to support rapid experimentation and quality analysis, enabling faster iteration on AI improvements

Qualifications:

  • 3+ years working with AI systems, with demonstrated experience shipping production-grade AI products

  • Deep hands-on experience with prompt engineering, LLM experimentation, and systematic evaluation of AI outputs

  • Strong experimental mindset, with the ability to design tests, analyze model performance, and iterate toward quality improvements

  • Experience post-training LLMs (RL, SFT, etc.)

  • Research-oriented approach to problem-solving; comfortable working in ambiguity and exploring novel solutions to AI quality challenges

  • Exceptional attention to detail and a deep commitment to output quality across all dimensions, including less visible aspects

  • Bachelor's degree in Computer Science, ML, or related field (or equivalent hands-on experience with AI research/experimentation)
