Claim Check · BYO-Judge

Ship agent code with confidence

Coding agents help your team ship faster — but a plausible-looking diff doesn't mean the change did what the task asked. Claim Check catches that gap before review using deterministic probes, with an optional second opinion from your own agent (BYO-Judge).

Your agent finishes a task and writes a confident summary — “Added retry handling for failed Stripe payment webhooks.” The diff looks plausible. But did it bind a handler to the payment-failure event, or refactor something nearby and call it done? Claim Check answers that one question, so you can ship the change instead of stopping to re-read the whole diff. Deterministic probes check the change against the specific, falsifiable expectations the task sets up — and when you want a second opinion on the long tail, it comes from your own agent, clearly labeled and advisory. You move faster because you can trust what the agent shipped, not because you skipped the check.

  • Catch claim gaps before review
  • BYO-Judge runs in your own agent
  • No Shipmoor model · no source upload
  • Deterministic evidence decides
  • LLM only advises
    $ shipmoor scan --diff main...HEAD \
        --intent "persist the order and charge the customer, then emit order.paid" \
        --agent "claude -p" --author-model-id my-authoring-model

    Claim check  GAP DISCLOSED  ·  coverage 3/4

    probes · deterministic
      ✓ satisfied     order row persisted to orders
      ✗ unsatisfied   payment captured on checkout
      ◦ cannot_check  refund path — no probe yet

    llm_inferred · BYO-Judge (claude -p) · advisory second opinion
      ~ change may charge in a sibling service — verify manually

    1 gap caught before review — surfaced while it's still cheap to fix

Claim Check on a payment change: deterministic probes find the gap; the BYO-Judge (your own agent) only advises.

Run it

Claim Check appears when you scan a changeset and supply the task's intent. The BYO-Judge is opt-in and only runs on the long tail, at medium-or-higher intent confidence.

  • Check a change against its task

    shipmoor scan --staged --intent "add retry to the webhook client"
  • Two agreeing sources raise confidence

    shipmoor scan --staged --intent "…" --prompt "…"
  • Opt into the BYO-Judge (your own agent)

    SHIPMOOR_INTENT_DRIFT_STAGE3=1 shipmoor scan --diff main...HEAD --intent "…" --agent "claude -p"
  • Assert judge isolation

    shipmoor scan --diff main...HEAD --intent "…" --agent "codex exec" --author-model-id my-model --strict-judge-isolation
  • Turn the gate on (deterministic only)

    shipmoor scan --staged --intent "…" --prompt "…" --verdict-policy .shipmoor/verdict-policy.yaml

No intent supplied? The scan output is unchanged from a plain Community scan. Offline mode (SHIPMOOR_OFFLINE=1) disables the BYO-Judge entirely.
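
The conditions in this section can be read as a small eligibility check. The sketch below is a hypothetical model, not Shipmoor's implementation: the function name judge_may_run, the Confidence levels, and every parameter are assumptions made for illustration. How opt-in is actually expressed (the environment variable, the --agent flag, or both) is left abstract as a single boolean, and the "long tail" selection is not modeled.

```python
from enum import IntEnum
from typing import Optional


class Confidence(IntEnum):
    """Hypothetical intent-confidence levels; names and ordering are assumptions."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2


def judge_may_run(*, intent: Optional[str], opted_in: bool, offline: bool,
                  confidence: Confidence) -> bool:
    """Return True only if every condition described in this section holds."""
    if not intent:       # no intent supplied: plain scan, nothing to judge
        return False
    if offline:          # offline mode disables the BYO-Judge entirely
        return False
    if not opted_in:     # the BYO-Judge is opt-in
        return False
    # The judge only runs at medium-or-higher intent confidence.
    return confidence >= Confidence.MEDIUM
```

Under these assumptions, an opted-in scan with a medium-confidence intent is eligible, while the same scan in offline mode, or with a low-confidence intent, is not.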

Your model, your machine, your call

Claim Check runs locally. The deterministic core never leaves your machine, and the optional LLM second opinion runs in your own agent under your own provider — Shipmoor hosts no model and uploads no source. The result is a verdict you can defend, not a vibe you have to trust.

How Claim Check works

Four steps, and only deterministic evidence moves the verdict. The LLM, when you opt in, only ever advises — so you act on falsifiable evidence, not a guess.

  1. Resolve the intent
  2. Deterministic probes check it
  3. BYO-Judge advises (opt-in)
  4. Verdict + evidence

The result is advisory by default — it surfaces the gap and stays out of your way. If you choose to gate CI, only deterministic evidence counts toward the verdict; a low-confidence intent or an LLM opinion never can. You stay in control of what ships.
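
The gating rule above can be sketched in a few lines. This is a hypothetical model rather than Shipmoor's code: ProbeResult, gate_passes, and the field names are invented for illustration, the status strings are borrowed from the sample output on this page, and treating cannot_check as non-blocking is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeResult:
    claim: str
    status: str   # "satisfied", "unsatisfied", or "cannot_check"
    source: str   # "deterministic" or "llm_inferred"


def gate_passes(results: List[ProbeResult], intent_confidence: str) -> bool:
    """Return False only when deterministic evidence shows a gap."""
    if intent_confidence == "low":
        # A low-confidence intent never counts toward a gating verdict.
        return True
    # LLM opinions are filtered out here: they advise but never gate.
    deterministic = [r for r in results if r.source == "deterministic"]
    # Assumption: only an explicit "unsatisfied" blocks; "cannot_check" does not.
    return not any(r.status == "unsatisfied" for r in deterministic)
```

In this model, an unsatisfied result from the BYO-Judge alone leaves the gate open, while a single unsatisfied deterministic probe closes it.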

Ship your next agent change with confidence

Install the free CLI, sign in to Shipmoor IC, and check your next agent change against the task it was given.

Get Shipmoor CLI

One installer. One shipmoor command. Free Community scans.

curl -fsSL https://dl.shipmoor.dev/install.sh | bash


Claim Check docs

Claim Check & BYO-Judge FAQ

How the deterministic core and the optional LLM second opinion fit together.

Contact sales

Our team can help with custom support, team rollouts, and self-hosted deployments. To get started right away, explore our self-serve plans.