Software Engineer, RL Environments - San Francisco, CA - $180K-$220K
Location: San Francisco, CA (in-person)
Compensation: $180,000 - $220,000 base, plus substantial profit share and competitive equity (expected total cash compensation around $500,000)
Join a fast-growing AI infrastructure company as a Software Engineer focused on RL environments, designing the datasets and evaluation rubrics that directly shape how frontier AI models learn.
What You'll Do
- Design data slices and explore data shapes that expose meaningful model failure modes across domains like finance, code, and enterprise workflows
- Build and refine evaluation rubrics and reward signals for RLHF and RLVR training pipelines
- Model annotator behavior and run experiments to improve specific model capabilities
- Develop quantitative frameworks for measuring dataset quality, diversity, and downstream impact on model alignment and capability
- Create and manage both real-world and synthetic data pipelines
- Partner with research teams at top AI labs to translate their training objectives into concrete data and evaluation specifications
What You'll Bring
- 1-4 years of software engineering experience with strong technical depth
- A genuine obsession with how data structure, selection, and quality drive model behavior
- The ability to design lightweight experiments, move fast, and extract actionable insights from messy results
- Comfort working across domains such as finance, software engineering, and policy
- A strong track record of shipping, with a clear bias toward building over theorizing
Nice to Have
- Prior work or internship at an RL environment company, AI safety organization, or benchmarking organization
- Experience as a founder or early engineer at an early-stage startup
- Experience building real-world and synthetic data pipelines
- Familiarity with RLHF or RLVR training pipelines
This is a high-leverage engineering seat with direct impact on how frontier AI models are trained, working hands-on with research teams at the world's leading AI labs.