- Professional
- Office in Overland Park
Description
Propio is on a mission to make communication accessible to everyone. As a leader in real-time interpretation and multilingual language services, we connect people with the information they need across language, culture, and modality. We’re committed to building AI-powered tools to enhance interpreter workflows, automate multilingual insights, and scale communication quality across industries.
We are hiring an Audio AI Engineer who will develop and optimize end-to-end systems that enable real-time, high-fidelity speech-to-speech interpretation at Propio. This role focuses on seamlessly connecting speech recognition, translation, and synthesis technologies to create natural, low-latency interpretation experiences.
Key Responsibilities:
- Design and optimize end-to-end Speech-to-Speech pipelines that integrate ASR, translation, and TTS with minimal latency
- Build bidirectional interpretation systems that handle turn-taking, speaker identification, and context preservation across language boundaries
- Collaborate with the Audio/Speech Engineer to optimize latency, quality, and robustness of speech components in the full pipeline
- Work with the Staff ML Engineer to design efficient inference architectures and deployment strategies for real-time streaming systems
- Develop streaming ASR and TTS systems capable of handling continuous, overlapping speech in interpretation scenarios
- Benchmark and optimize latency across all pipeline stages (speech capture, recognition, translation, synthesis)
- Integrate speaker diarization, acoustic environment adaptation, and speech enhancement into interpretation workflows
- Partner with linguists and product teams to validate interpretation quality and gather domain-specific feedback
Requirements
Qualifications:
- Bachelor's or Master's degree in Electrical Engineering, Computer Science, or a related field
- 3+ years of experience in speech processing, audio engineering, or conversational AI systems
- Deep expertise in ASR, TTS, and streaming audio architectures
- Proficiency in Python and ML frameworks, plus experience with real-time signal processing
- Experience building low-latency production systems and optimizing for inference performance
- Strong understanding of interpretation workflows, multilingual challenges, and speech quality metrics
Preferred Qualifications:
- Experience building speech-to-text pipelines or hybrid ASR + LLM systems
- Familiarity with real-time audio processing or latency-sensitive applications