ML Engineer - Distributed Training with verification

Gensyn · United States · Hybrid

About the job

Machine intelligence will soon take over humanity’s role in knowledge-keeping and creation. What started in the mid-1990s as the gradual off-loading of knowledge and decision making to search engines will be rapidly replaced by vast neural networks - with all knowledge compressed into their artificial neurons. Unlike organic life, machine intelligence, built within silicon, needs protocols to coordinate and grow. And, like nature, these protocols should be open, permissionless, and neutral. Starting with compute hardware, the Gensyn protocol networks together the core resources required for machine intelligence to flourish alongside human intelligence.

As a Machine Learning Engineer at Gensyn, you would:

  • Productionise advanced ML parallelisation and verification frameworks
  • Convert novel hybrid parallelisation and verification research into production code

Competencies

Must have

  • Comfortable working in environments with a heavy research component - and the uncertainty and trade-offs this creates
  • Demonstrable production experience working with modern parallelisation frameworks for training (e.g. FSDP, Megatron-LM, DeepSpeed) or frameworks for production-scale inference (e.g. ONNX Runtime, TensorRT, DeepSpeed-Inference, NVIDIA Triton, and TorchServe)
  • Deep theoretical knowledge of deep learning or distributed systems

Should have

  • Experience working in high-growth start/scale-up environments
  • Deep understanding of, and experience with, common networking protocols (IP, TCP, UDP, HTTP) and communication backends (NCCL, Gloo, MPI)

Nice to have

  • Experience with compiler design
  • Strong systems programming experience (especially Rust)
  • Meaningful exposure to decentralised communication, distributed consensus, blockchains

Compensation / Benefits

  • Competitive salary + share of equity and token pool
  • Fully remote work - we hire between the West Coast (PT) and Central Europe (CET) time zones
  • Relocation assistance - available for those who would like to relocate after being hired (anywhere from the PT through CET time zones)
  • Four all-expenses-paid company retreats around the world per year
  • Whatever equipment you need
  • Paid sick leave
  • Private health, vision, and dental insurance - including spouse/dependents [🇺🇸 only]

Our Principles

Autonomy

  • Don’t ask for permission - we have a constraint culture, not a permission culture
  • Claim ownership of any work stream and set its goals/deadlines, rather than waiting to be assigned work or relying on job specs
  • Push & pull context on your work rather than waiting for information from others and assuming people know what you’re doing
  • No middle managers - we don’t (and will likely never) have middle managers

Focus

  • Small team - misalignment and politics scale super-linearly with team size. Small protocol teams rival much larger traditional teams
  • Thin protocol - build and design thinly
  • Reject waste - guard the company’s time, rather than wasting it in meetings without clear purpose/focus, or bikeshedding

Reject mediocrity

  • Give direct feedback to everyone immediately rather than avoiding unpopularity, expecting things to improve naturally, or trading short-term pain for extreme long-term pain
  • Embrace an extreme learning rate rather than assuming limits to your ability/knowledge
  • No quit - push to the final outcome, despite any barriers
