Real-Time Audio Systems Engineer (Real-Time Browser Audio)
Job Description
We are seeking a highly skilled Real-Time Audio Systems Engineer with strong expertise in browser-based audio streaming, WebSockets, and real-time voice bot architecture. The ideal candidate will design and implement low-latency, full-duplex audio pipelines that run directly in the browser and integrate with modern STT, LLM, and TTS systems.
Responsibilities
Build real-time browser audio clients using Web Audio API, MediaRecorder, AudioContext, AudioWorklets, and WebSockets.
Implement bi-directional audio streaming (microphone → backend → speaker) with latency targets under 500 ms.
Integrate streaming STT (AssemblyAI, OpenAI Realtime, Amazon Transcribe, or similar).
Implement streaming TTS playback, including chunked audio decoding and buffered playback.
Develop backend streaming servers (Node.js/Python) for handling audio frames, STT events, LLM routing, and TTS responses.
Build and optimize wake-word workflows and VAD (voice activity detection).
Handle audio formats & codecs (PCM, Opus, WAV) and real-time transcoding.
Ensure reliability through reconnection logic, jitter buffers, packet timing, and audio frame synchronization.
Requirements
Strong experience with WebSockets and browser-based real-time audio.
Deep understanding of the Web Audio API, audio buffers, sample rates, streams, and playback pipelines.
Hands-on experience with streaming STT/TTS and integrating LLMs.
Proficiency in Node.js or Python for real-time backend services.
Knowledge of DSP basics (noise suppression, VAD), latency optimization, and audio pipeline debugging.
Nice to Have
Experience with WebRTC audio.
Familiarity with wake-word engines (e.g., Picovoice Porcupine).
Experience building Alexa/Siri-like conversational agents.