Software Engineer, AI/ML GenAI at Instrumentl
Instrumentl · Remote, United States · Remote
What you will do
- Design agentic systems & ship AI to production: Turn prototypes into resilient, observable services with clear SLAs, rollback/fallback strategies, and cost/latency budgets. Build tool-using LLM "agents" (task planning, function/tool calling, multi-step workflows, guardrails) for tasks like grant discovery, application drafting, and research assistance.
- Own RAG end-to-end: Ingest and normalize content, choose chunking/embedding strategies, implement hybrid retrieval, re-ranking, citations, and grounding. Continuously improve recall/precision while managing index health.
- Manage embeddings at scale: Select, evaluate, and migrate embedding models; maintain vector stores (e.g., pgvector/FAISS/Pinecone/Weaviate/Milvus/Qdrant); monitor drift and rebuild strategies.
- Fine-tune & build evaluation: Run SFT/LoRA or instruction-tuning on curated datasets; evaluate the ROI vs. prompt engineering/model selection; manage data versioning and reproducibility. Create offline and online eval harnesses (helpfulness, groundedness, hallucination, toxicity, latency, cost), synthetic test sets, red-teaming, and human-in-the-loop review.
- Collaborate cross-functionally while raising engineering standards: Work side by side with Product, Design, and GTM on scoping, UX, and measurement; run experiments (A/B, canaries), interpret results, and iterate. Write clear, maintainable code, add tests and docs, and contribute to reliability practices (alerts, dashboards, incident response).
What we're looking for
- Software engineering background: 5+ years of professional software engineering experience, including 2+ years working with modern LLMs (as an IC). Startup experience and comfort operating in fast, scrappy environments are a plus.
- Proven production impact: You've taken LLM/RAG systems from prototype to production, owned reliability/observability, and iterated post-launch based on evals and user feedback.
- LLM agentic systems: Experience building tool/function-calling workflows, planning/execution loops, and safe tool integrations (e.g., with LangChain/LangGraph, LlamaIndex, Semantic Kernel, or custom orchestration).
- RAG expertise: Strong grasp of document ingestion, chunking/windowing, embeddings, hybrid search (keyword + vector), re-ranking, and grounded citations. Experience with re-rankers/cross-encoders, hybrid retrieval tuning, or search/recommendation systems.
- Embeddings & vector stores: Hands-on with embedding model selection/versioning and vector DBs (e.g., pgvector, FAISS, Pinecone, Weaviate, Milvus, Qdrant). Document processing at scale (PDF parsing/OCR), structured extraction with JSON schemas, and schema-guided generation.
- Evaluation mindset: Comfort designing eval suites (RAG/QA, extraction, summarization), using automated and human-in-the-loop methods; familiarity with frameworks like Ragas/DeepEval/OpenAI Evals or equivalent.
- Infrastructure & languages: Proficiency in Python (FastAPI, Celery) and TypeScript/Node; familiarity with Ruby on Rails (our core platform) or willingness to learn. Experience with AWS/GCP, Docker, CI/CD, and observability (logs/metrics/traces).
- Data chops: Comfortable with SQL, schema design, and building/maintaining data pipelines that power retrieval and evaluation.
- Collaborative approach: You thrive in a cross-functional environment and can translate researchy ideas into shippable, user-friendly features.
- Results-driven: Bias for action and ownership with an eye for speed, quality, and simplicity.
Nice to have
- Fine-tuning: Practical experience with SFT/LoRA or instruction-tuning (and good intuition for when fine-tuning vs. prompting vs. model choice is the right lever).
- Exposure to open-source LLMs (e.g., Llama) and providers (e.g., OpenAI, Anthropic, Google, Mistral).
- Familiarity with responsible AI, red-teaming, and domain-specific safety policies.
Compensation & Benefits
- Salary ranges are based on market data, relative to our size, industry, and stage of growth. Salary is one part of total compensation, which also includes equity, perks, and competitive benefits.
- For US-based candidates, our target salary band is $175,000 - $220,000/year + equity. Salary decisions will be based on multiple factors including geographic location, qualifications for the role, skillset, proficiency, and experience level.
- 100% covered health, dental, and vision insurance for employees, 50% for dependents
- Generous PTO policy, including parental leave
- 401(k)
- Company laptop + stipend to set up your home workstation
- Company retreats for in-person time with your colleagues
- Work with awesome nonprofits around the US. We partner with incredible organizations doing meaningful work, and you get to help power their success.