Job Openings
Founding Engineer
About the job
We're a bootstrapped, early-stage startup building the next generation of voice-first and vision-aware AI agents: systems that can hear, see, understand, and act. Our platform turns real-time voice, video, and multimodal inputs into intelligent workflows for mentorship, sales support, and human-in-the-loop operations.
We work at the intersection of voice AI (ElevenLabs, Whisper, Deepgram), vision models, agentic orchestration, and streaming infrastructure. If you're excited about creating AI agents that talk, listen, watch, and collaborate like real teammates, this is ground zero.
You'll define the technical direction, shape engineering culture, and help deliver a real product with real users, not just prototypes.
What You'll Do
Build the Voice + Sight Agent Engine
- Architect the real-time agent OS that powers multimodal conversation, perception, and reasoning.
- Design scalable pipelines for voice synthesis, voice cloning, streaming ASR, multimodal perception, memory, and human-in-the-loop controls.
- Integrate best-in-class voice tech (e.g., ElevenLabs, Whisper) with vision models to enable rich, real-time agent interactions.
Build the Full-Stack Interface Where Agents Operate
- Create dashboards, live voice rooms, multimodal monitoring, and quick human-handoff tools.
- Build UIs for video understanding, event detection, transcript playback, and agent analytics.
Engineer Multimodal Creation & Action Pipelines
- Turn voice + video inputs into structured agent actions (alerts, insights, steps, recommendations).
- Build pipelines for script generation, voice messaging, video clipping, and contextual agent behaviors.
Deliver a Production-Ready Demo in 3 Months
- Build a working customer-facing demo that is:
  - secure (auth, privacy, data isolation)
  - scalable (real-time streaming, concurrency, monitoring)
  - reliable (fault-tolerant pipelines, uptime guarantees)
- This demo will power customer pilots and early revenue.
- You'll own the architecture that ensures the platform can scale safely from day one.
Ship 0-to-1 Features, Fast
- Work in true greenfield mode — rapid cycles, fast decisions, no bureaucracy.
- Launch major agent capabilities weekly.
- Establish core engineering standards and long-term technical foundations.
Shape the Company From Day One
- Partner directly with founders on roadmap, architecture, and product vision.
- Help define culture, hiring philosophy, rituals, and engineering excellence.
- Co-founder title is possible for the right person.
What We're Looking For
Core Engineering Strengths
- 5–7 years of hands-on engineering experience
- Strong with TypeScript, Next.js, Node.js, Python, React, PostgreSQL
- Solid cloud experience (AWS preferred)
- Product-driven mindset with strong ownership
Voice / Vision / Agentic AI Skills (Any of These)
- Integrating ElevenLabs, Whisper, Deepgram, or real-time voice systems
- Working with vision models (OpenAI Realtime, CLIP, video/image understanding)
- Building LLM-driven or agentic frameworks
- Experience with WebRTC, sockets, or streaming infrastructure
Who Will Love This Role
- Builders who thrive in ambiguous, fast-moving environments
- Engineers who enjoy taking an idea from prototype to production
- Creators excited about multimodal AI and the future of interaction
- People who want meaningful ownership and impact