Research Scientist, Frontier Data - San Francisco, CA - $150K-$250K
Location: San Francisco, CA (in-person)
Compensation: $150,000 - $250,000 base, plus bonus and equity (total cash compensation can reach $250,000 - $450,000+)
Join a fast-growing AI infrastructure company as a Research Scientist, designing the datasets and evaluation frameworks that shape how frontier AI models are trained and measured.
What You'll Do
- Design data slices and explore data shapes that expose meaningful model failure modes across domains, including finance, code, and enterprise workflows
- Build and refine evaluation rubrics and reward signals for RLHF and RLVR training pipelines
- Model annotator behavior and run experiments to improve different model capabilities
- Develop quantitative frameworks for measuring dataset quality, diversity, and downstream impact on model alignment and capability
- Partner with research teams at the world's top AI labs to translate their training objectives into concrete data and evaluation specifications
- Move fast from hypothesis to experiment, extract actionable insights from messy results, and iterate quickly
What You'll Bring
- Strong quantitative instincts and familiarity with LLM training pipelines, RLHF or RLVR, or evaluation methodology (no PhD required)
- A genuine, intrinsic obsession with how data structure, selection, and quality drive model behavior
- The ability to design lightweight experiments, move fast, and extract insights from messy or incomplete results
- Comfort working across domains such as finance, software engineering, and policy, with the ability to context-switch and reason clearly
- A strong bias toward building and shipping experiments over theorizing
Nice to Have
- Prior work or internship at an RL environment company, AI safety organization, or benchmarking organization
- Background in evaluation methodology, benchmark design, or dataset curation at a lab or research organization
- Exposure to annotator modeling, reward signal design, or alignment-related research
This is a high-leverage research seat on a small, high-caliber team, where your work directly shapes how the next generation of frontier models learns.