Founding Head of AI Infrastructure
Location: San Francisco Bay Area (Hybrid) or Remote
Experience Level: Senior / Lead
Job Type: Full-Time
About the Role
We're looking for a Founding Head of AI Infrastructure to own the performance and realism layer of our AI systems. This role focuses on pushing GPUs to their limits, scaling multimodal models, and delivering ultra-low-latency inference so that interactions feel instant and lifelike. You'll transform cutting-edge research into production-ready systems that scale globally.
What You'll Build
- Ultra-low-latency inference pipelines (<500ms) leveraging NVIDIA Triton, Ray, Kubernetes, and Terraform.
- GPU/TPU-optimized training and deployment for multimodal models.
- Performance monitoring and observability that balance speed with reliability.
- Cost-efficient infrastructure capable of scaling to millions of users.
- Research-to-production workflows that turn breakthroughs into products.
- Seamless interfaces with AI Engineering, Product, and Design to keep the system in sync across modalities.
You Might Be a Fit If You
- Have shipped ML-powered products at scale (consumer or developer-facing).
- Live and breathe inference optimization and distributed ML infrastructure.
- Can debug across the stack: CUDA kernels, GPU scheduling, Python services, and APIs.
- Have experience with real-time ML, multimodal models, or video generation.
- Care about user experience as much as FLOPs.
- Thrive in founder-level roles where you own outcomes end to end.
Why This Role Is Special
- Founding role with deep ownership of technical vision and infrastructure strategy.
- Define the stack and performance layer for next-generation AI systems.
- Flexible hybrid schedule in San Francisco, or remote for outstanding candidates.
- Competitive founding equity package alongside lean early-stage base comp.
- Work with a team of engineers from top AI labs and cloud companies, building at the edge of what's possible.
Bonus Points If
- You've squeezed 10x performance gains from impossible GPU/TPU constraints.
- You've built real-time AI agents (speech, video, or multimodal) that delight users.
- You've published in ML/CV conferences or contributed to open-source ML systems.
- You believe AI should augment human connection, not replace it.
Perks & Benefits
- Platinum-level health insurance (fully covered)
- 4% 401(k) match, because long-term thinking matters
- Hybrid-friendly in SF Bay Area or fully remote for the right candidate
- Competitive early equity with outsized product ownership
- Fast-paced, mission-driven team that values thoughtful execution and experimentation
- Work on something weird, personal, powerful and potentially world-changing
If you're excited about creating the emotional core of human-AI interaction and want to work at the frontier of LLMs, agents, and real-time interfaces, we'd love to hear from you.