Judgment Labs — Agent Product Engineer

San Francisco, California, United States

Type: Full-time | On-site | San Francisco (FiDi), CA
Compensation: $200,000–$350,000 + Competitive Equity
Hiring count: 1
Visa sponsorship: Yes (H-1B)
Reports to: Not specified on the page

About Judgment Labs

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). Where traditional observability logs exceptions and latency, ABM surfaces behavioral anomalies — instruction drift, context retrieval loss — in production at scale. Hundreds of teams building autonomous agents rely on Judgment to understand post-deployment behavior: clustering patterns across conversations and workflows, correlating regressions to specific interaction types, and pinpointing where reliability breaks down.

Founded: Not stated | Team size: ~19 (per role body) | Total funding: $30M+ (two rounds in the past 5 months)
Industry: AI infrastructure / agent observability & evals
Website: judgmentlabs.ai
Office: San Francisco (FiDi)
Investors: Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

Why Candidates Should Join

Funded momentum: $30M+ raised across two rounds in five months from top-tier investors.
Build 0-to-1: ship agent capabilities from a blank page on a fast-scaling product surface.
Founding-track: small (~19-person) team; fast track to founding-level experience and direct customer interaction.
Perks: full benefits package, Equinox membership, private chef.

Intake Call Summary

No intake call summary was present on the role page. (flag — request from Contrario if one exists.)

The Role

Judgment Labs is hiring an Agent Product Engineer to build high-taste products for self-learning agents. The role is majority agent work on a full-stack baseline. Candidates can come from a front-end/design engineering background or an AI engineering background, but in either case must have prior experience building and designing agents, ideally at a startup with 0-1 product ownership.

What You'll Be Doing

Build high-taste agent products that pair powerful behavior with consumer-grade UX polish
Design and ship agent capabilities from 0 to 1 inside a fast-scaling product surface
Contribute across the full stack as needed, with the majority of work on agent infrastructure and product
Translate customer feedback on agent behavior into concrete product iterations
Help raise the bar on product taste and craft as the team grows past 19 people

Tech stack: TypeScript (full-stack); no further stack specified.

Requirements

3-7 years of engineering experience
Agent or Applied AI background required
Comfort with 0-1 product ownership
Background as a front-end/design engineer or AI engineer
TypeScript fluency
Evals experience heavily preferred

Green Flags

Shipped agents in production at a startup. Can walk through context design, tool design, and reasoning-loop tradeoffs.
Strong product engineering range. Has shipped customer-facing product features end to end, not just backend infrastructure or research artifacts.
Demonstrably strong communicator. Customer calls, internal demos, technical writing, public talks, or comparable signal of comfort explaining complex AI concepts.
Prior evals, observability, or behavior-monitoring product experience. Direct adjacency to Judgment's space, heavily preferred.
Prior 0-1 product ownership at a seed or Series A startup. Built agents from a blank page.
High-intensity background signal: olympiad medals, debate competitions, competitive athletics, founder experience, or comparable high-output indicators.

Red Flags

No prior agent or Applied AI experience. The bar is production agent work.
Front-end or design engineering only without agent depth.
Backend or infrastructure only without product-shipping experience or customer-facing range.
Weak customer-facing skills. The seat is customer-facing; candidates who want heads-down work only are not a fit.
Traditional ML or model-training background without agent system experience.
Cannot or will not work 5 days in person at the FiDi office.

Role Details

Salary: $200,000–$350,000
Equity: Competitive (not quantified)
On-site policy: 5 days/week in office (Monday–Friday), FiDi
Visa sponsorship: H-1B
Employment type: Full-time
Location: San Francisco (FiDi), CA (header shows "Chinatown, CA"; see flag)

Screening Questions

(Contrario "Required Candidate Q&A" form fields)

Where are you currently located?
LinkedIn URL
Are you legally authorized to work in the United States?
Will you require work sponsorship now and/or in the future?
Are you located in the San Francisco area and/or willing to relocate?
Are you willing to work on-site in our San Francisco office Monday-Friday?

Interview Process

Stage 1: Pending Approval. Candidates awaiting initial approval.
Stage 2: Founder vibe check (30 min). Plus optional 15-min deeper dive into technical projects.
Stage 3: Technical Interview (75 min). Problem-solving (30 min) + role-specific interview (45 min).
Stage 4: Work Trial
Stage 5: Offer Extended
Stage 6: Candidate Hired. When the candidate accepts and starts.

Ideal Companies & Backgrounds

Pulled from the role page's Ideal Companies grid.

Clearly recognizable: Linear, Cursor, Ramp, Figma, Vercel, CockroachDB, Modal Labs, Anyscale, Runway, Applied Intuition, Anduril, Notion, Nomic AI, MotherDuck, LangChain

Ideal Candidate Profiles

None provided on the role page.

Rejected Candidate Feedback

None provided on the role page.
