Judgment Labs — Agent Product Engineer

Type: Full-time | On-site | San Francisco (FiDi), CA
Compensation: $200,000–$350,000 + Competitive Equity
Hiring count: 1
Visa sponsorship: Yes (H-1B)
Reports to: Not specified on the page

About Judgment Labs

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). Where traditional observability logs exceptions and latency, ABM surfaces behavioral anomalies — instruction drift, context retrieval loss — in production at scale. Hundreds of teams building autonomous agents rely on Judgment to understand post-deployment behavior: clustering patterns across conversations and workflows, correlating regressions to specific interaction types, and pinpointing where reliability breaks down.

Founded: Not stated
Team size: ~19 (per role body)
Total funding: $30M+ (two rounds in the past 5 months)
Industry: AI infrastructure / agent observability & evals
Website: judgmentlabs.ai
Office: San Francisco (FiDi)
Investors: Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

Why Candidates Should Join

  • Funded momentum: $30M+ raised across two rounds in five months from top-tier investors.
  • Build 0-to-1: ship agent capabilities from a blank page on a fast-scaling product surface.
  • Founding-track: small (~19-person) team; fast track to founding-level experience and direct customer interaction.
  • Perks: full benefits package, Equinox membership, private chef.

Intake Call Summary

  • No intake call summary was present on the role page. (flag — request from Contrario if one exists.)

The Role

Judgment Labs is hiring an Agent Product Engineer to build high-taste products for self-learning agents. The work is mostly agent-focused, built on a full-stack baseline. Candidates can come from either a front-end/design engineering background or an AI engineering background, but in both cases they must have prior experience building and designing agents, ideally at a startup with 0-1 product ownership.

What You'll Be Doing

  • Build high-taste agent products that pair powerful behavior with consumer-grade UX polish
  • Design and ship agent capabilities from 0 to 1 inside a fast-scaling product surface
  • Contribute across the full stack as needed, with the majority of work on agent infrastructure and product
  • Translate customer feedback on agent behavior into concrete product iterations
  • Help raise the bar on product taste and craft as the team grows past 19 people

Tech stack: TypeScript (full-stack); no further stack specified.

Requirements

  • 3-7 years of engineering experience
  • Agent or Applied AI background (required)
  • Comfortable owning a product from 0 to 1
  • Background as a front-end/design engineer or an AI engineer
  • Fluent in TypeScript
  • Evals experience heavily preferred

Green Flags

  • Shipped agents in production at a startup. Can walk through context design, tool design, and reasoning-loop tradeoffs.
  • Strong product engineering range. Has shipped customer-facing product features end to end, not just backend infrastructure or research artifacts.
  • Demonstrably strong communicator. Customer calls, internal demos, technical writing, public talks, or comparable signal of comfort explaining complex AI concepts.
  • Prior evals, observability, or behavior-monitoring product experience. Direct adjacency to Judgment's space, heavily preferred.
  • Prior 0-1 product ownership at a seed or Series A startup. Built agents from a blank page.
  • High-intensity background signal: olympiad medals, debate competitions, competitive athletics, founder experience, or comparable high-output indicators.

Red Flags

  • No prior agent or Applied AI experience. The bar is production agent work.
  • Front-end or design engineering only without agent depth.
  • Backend or infrastructure only without product-shipping experience or customer-facing range.
  • Weak customer-facing skills. The seat is customer-facing; candidates who want heads-down work only are not a fit.
  • Traditional ML or model-training background without agent system experience.
  • Cannot or will not work 5 days in person at the FiDi office.

Role Details

Salary: $200,000–$350,000
Equity: Competitive (not quantified)
On-site policy: 5 days/week in office (Monday–Friday), FiDi
Visa sponsorship: H-1B
Employment type: Full-time
Location: San Francisco (FiDi), CA (header shows "Chinatown, CA"; see flag)

Screening Questions

(Contrario "Required Candidate Q&A" form fields)

  1. Where are you currently located?
  2. LinkedIn URL
  3. Are you legally authorized to work in the United States?
  4. Will you require work sponsorship now and/or in the future?
  5. Are you located in the San Francisco-area and/or willing to relocate?
  6. Are you willing to work on-site in our San Francisco office Monday-Friday?

Interview Process

  • Stage 1: Pending Approval. Candidates awaiting initial approval.
  • Stage 2: Founder vibe check (30 min), plus an optional 15-min deeper dive into technical projects.
  • Stage 3: Technical Interview (75 min). Problem-solving (30 min) + role-specific interview (45 min).
  • Stage 4: Work Trial
  • Stage 5: Offer Extended
  • Stage 6: Candidate Hired. When the candidate accepts and starts.

Ideal Companies & Backgrounds

Pulled from the role page's Ideal Companies grid.

Clearly recognizable: Linear, Cursor, Ramp, Figma, Vercel, CockroachDB, Modal Labs, Anyscale, Runway, Applied Intuition, Anduril, Notion, Nomic AI, MotherDuck, LangChain

Ideal Candidate Profiles

None provided on the role page.

Rejected Candidate Feedback

None provided on the role page.