Audio AI Engineer at Propio

Propio · Overland Park, United States of America · Onsite

Description


Propio is on a mission to make communication accessible to everyone. As a leader in real-time interpretation and multilingual language services, we connect people with the information they need across language, culture, and modality. We’re committed to building AI-powered tools to enhance interpreter workflows, automate multilingual insights, and scale communication quality across industries.


We are hiring an Audio AI Engineer to develop and optimize end-to-end systems that enable real-time, high-fidelity speech-to-speech interpretation at Propio. This role focuses on seamlessly connecting speech recognition, translation, and synthesis technologies to create natural, low-latency interpretation experiences.



Key Responsibilities:

  • Design and optimize end-to-end speech-to-speech pipelines that integrate ASR, translation, and TTS with minimal latency
  • Build bidirectional interpretation systems that handle turn-taking, speaker identification, and context preservation across language boundaries 
  • Collaborate with the Audio/Speech Engineer to optimize latency, quality, and robustness of speech components in the full pipeline
  • Work with the Staff ML Engineer to design efficient inference architectures and deployment strategies for real-time streaming systems
  • Develop streaming ASR and TTS systems capable of handling continuous, overlapping speech in interpretation scenarios
  • Benchmark and optimize latency across all pipeline stages (speech capture, recognition, translation, synthesis)
  • Integrate speaker diarization, acoustic environment adaptation, and speech enhancement into interpretation workflows
  • Partner with linguists and product teams to validate interpretation quality and gather domain-specific feedback

Requirements


Qualifications:

  • Bachelor’s or Master’s degree in Electrical Engineering, Computer Science, or a related field
  • 3+ years of experience in speech processing, audio engineering, or conversational AI systems
  • Deep expertise in ASR, TTS, and streaming audio architectures
  • Proficiency in Python and ML frameworks, plus experience with real-time signal processing
  • Experience building low-latency production systems and optimizing for inference performance
  • Strong understanding of interpretation workflows, multilingual challenges, and speech quality metrics


Preferred Qualifications:

  • Experience building speech-to-text pipelines or hybrid ASR + LLM systems
  • Familiarity with real-time audio processing or latency-sensitive applications

