Applied Machine Learning Engineer, Audio & Speech

San Francisco, California, United States Confidential Search

Location: San Francisco or New York (On-site preferred)
Compensation: $200,000 to $250,000 + Competitive Equity
Employment Type: Full-time, Permanent
Visa Sponsorship: Available

Who are we?

We're an early-stage, high-growth startup at the intersection of AI and audio. Our mission is to bring AI into the real world, starting with sound. Audio is the most human, accessible, and versatile input, and we believe it's the gateway to truly intelligent systems. Our team is pioneering a new category by applying rigorous R&D practices to the creation of high-quality audio datasets that fuel next-generation models.

In less than a year, we've secured partnerships with top AI labs and most of FAANG, raised $5M+ from leading investors, and built a team of ambitious, humble, and mission-driven builders. Now we're hiring applied ML engineers to help us unlock the full potential of audio AI.

What's in it for you?

Build generative AI models that directly advance the field of speech and audio
Full ownership over ML pipelines from research to real-world deployment
Work with a tight-knit team of former AI infrastructure builders from top companies
Rapid career growth at one of the fastest-moving companies in a booming space
Competitive salary + equity
100% paid health, dental, and vision insurance
Daily paid meals via DoorDash
Flexible PTO + 401(k) access
Visa sponsorship available

What will you do?

Research and develop advanced ML models for speech and audio use cases
Translate cutting-edge research into high-quality production Python code
Own the full ML pipeline from POC to deployment in cloud environments (AWS or GCP)
Build APIs and inference systems that surface insights from high-quality data
Work cross-functionally with Operations to source, shape, and evaluate training datasets
Architect resilient infrastructure for model training, evaluation, and production inference
Help define the company's ML roadmap and lead investments in research and tooling

What will you need?

5+ years of experience in machine learning, with at least 2 years in DSP/audio ML
Demonstrated experience shipping ML systems end-to-end (not just research)
Strong Python skills and proficiency in deep learning frameworks like PyTorch
Expertise in cloud environments (AWS or GCP) for training and deploying models
Ability to think holistically about model performance, UX, and business value
Experience in speech or audio research and/or production

Bonus points for:

Experience training generative AI models
PhD or Master's in Computer Science, ML, or a related field
Leading ML teams and driving technical direction
Experience building classical and ML-based audio signal processing systems
Startup experience, particularly in zero-to-one product development

We're hiring two engineers for this role, so if you're passionate about audio, ML, and making a real-world impact, we'd love to hear from you. You'll be joining a team that values speed, ownership, technical excellence, and above all, building things that matter.