Replay

Decision record replay

Each record below is a complete audicta.pa.v1 document — content-addressed, evaluator-isolated, byte-verifiable in your browser. Click a case to read the full reasoning chain and verify the SHA-256 content_hash yourself.

Curated invocation set

Records are added by hand from the authorized live invocation queue. Each one was reviewed before publication; the corpus is not an automatic feed.

A log line vs. an evidence record

Two records of the same denial decision. The first is what an observability stack typically captures. The second is what Audicta produces at decision time, sealed by content hash.

Without Audicta

What your AI gives you today

{
  "ts":     "2026-04-25T14:31:44Z",
  "model":  "gpt-4-turbo",
  "input":  "[case_007 prompt …]",
  "output": "DENY"
}

A log line.

You can defend exactly one thing: that the model said it. The reasoning, the alternatives, the evidence base, and the cross-check are all gone. Months from now, an auditor asking why has nothing to read.

With Audicta

What an audicta.pa.v1 record gives you

{
  "record_version": "audicta.pa.v1",
  "record_id":      "rec_2026-04-25T14-31-44Z…",
  "case":           { "case_id": "case_007", … },
  "agent_chain": [
    { "agent": "clinical_reviewer",  "output": { what, why,
        alternatives_considered, tradeoff, citations } },
    { "agent": "criteria_mapper",    "output": { … } },
    { "agent": "evidence_retriever", "output": { … } }
  ],
  "decision": {
    "verdict":      "DENY",
    "verdict_basis": "LCD §B.1 prerequisites unmet …",
    "alternative_pathway_cited": { "service": "…", "covered": true }
  },
  "evaluation": {
    "evaluator_local":  { "model": "qwen2.5-coder:7b", "scores": {…} },
    "evaluator_cloud":  { "model": "claude-sonnet-4",  "scores": {…} },
    "convergence":      { "ceo_flag": true, "max_div": 1.0 }
  },
  "provenance": {
    "agent_genome_hash": "sha256:…",
    "kb_snapshot_hash":  "sha256:…"
  },
  "content_hash": "sha256:…"
}

An evidence record.

You can defend the reasoning, the alternatives considered, the citations, the cross-evaluator agreement, and the integrity of the document itself — independently. The difference is not "more logging." It's a different kind of artifact.
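As a rough illustration of what byte-verifying a content_hash could involve, the sketch below recomputes a SHA-256 digest over a record with its content_hash field removed. The canonicalization rule (sorted keys, compact separators, UTF-8) and the helper names compute_content_hash and verify are assumptions made for this example. The page does not say how an audicta.pa.v1 record is serialized before hashing, so real records may not verify under this exact scheme.

```python
import hashlib
import json


def compute_content_hash(record: dict) -> str:
    """Hash every field except content_hash itself (assumed convention)."""
    body = {k: v for k, v in record.items() if k != "content_hash"}
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    canonical = json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record: dict) -> bool:
    """Return True when the stored content_hash matches the recomputed one."""
    return record.get("content_hash") == compute_content_hash(record)


# Toy record for illustration only; real records carry the full field set.
toy = {
    "record_version": "audicta.pa.v1",
    "decision": {"verdict": "DENY"},
}
toy["content_hash"] = compute_content_hash(toy)
sealed_ok = verify(toy)

# Flip the verdict after sealing: the stored hash no longer matches.
tampered = json.loads(json.dumps(toy))
tampered["decision"]["verdict"] = "APPROVE"
tampered_ok = verify(tampered)
```

Under these assumptions, changing any field after sealing changes the digest, which is what lets a reader detect tampering without trusting the publisher.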

Cases in the corpus

borderline deny

DENY

Lumbar MRI · Borderline deny · Incomplete conservative trial

A patient with three weeks of mild low back pain asks for a lumbar MRI. Insurance requires the patient to first complete a documented physical-therapy trial — this patient attended only 2 of 6 prescribed sessions over three weeks, so the record denies the imaging and points to the physical-therapy program insurance actually covers as the next step.

Why this case is in the corpus: When criteria are not met, the record cites both why and what to do instead: the denial is paired with a covered alternative pathway (LCD §D). The system also flags its own borderline cases for human review via ceo_flag.

Read the record

clear approve

APPROVE

Lumbar MRI · Clear approve · Red-flag bypass

A patient with worsening nerve symptoms (numbness, weakness) asks for a lumbar MRI. The clinical findings are 'red flags' that bypass the usual requirement to wait four weeks and try basic care first, so the record approves the imaging immediately.

Why this case is in the corpus: When the input is unambiguous, the output is unambiguous: a clean record, dense citations, and tightly converging evaluators.

Read the record
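The evaluator convergence and ceo_flag mentioned in the cases above can be pictured with a small sketch. It assumes that max_dimension_divergence is the largest absolute gap between the two evaluators' scores on any shared dimension, and that ceo_flag is set when that gap reaches a threshold. The dimension names, the score values, and the threshold of 1.0 are all hypothetical. The page only shows that a divergence of 1.0 was flagged and a divergence of 0.5 was not.

```python
REVIEW_THRESHOLD = 1.0  # hypothetical cutoff, not taken from the source


def max_dimension_divergence(local_scores: dict, cloud_scores: dict) -> float:
    """Largest absolute score gap across dimensions both evaluators rated."""
    shared = local_scores.keys() & cloud_scores.keys()
    if not shared:
        raise ValueError("evaluators share no scoring dimensions")
    return max(abs(local_scores[d] - cloud_scores[d]) for d in shared)


def convergence(local_scores: dict, cloud_scores: dict,
                threshold: float = REVIEW_THRESHOLD) -> dict:
    """Build a convergence block; flag for human review at or above threshold."""
    div = max_dimension_divergence(local_scores, cloud_scores)
    return {"max_dimension_divergence": div, "ceo_flag": div >= threshold}


# Invented scores: the evaluators disagree by a full point on one dimension.
flagged = convergence(
    {"reasoning": 4.0, "citations": 4.0},
    {"reasoning": 3.0, "citations": 4.0},
)

# Invented scores: the largest gap is half a point, below the assumed cutoff.
clear = convergence(
    {"reasoning": 4.0, "citations": 4.0},
    {"reasoning": 3.5, "citations": 4.0},
)
```

Taking the maximum rather than the mean means a single sharply contested dimension is enough to route a record to human review, which fits the borderline case described above.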

The same case, two agent versions

Audicta's reproducibility property, applied across time. Both records below describe case_007 and reach the same verdict; they differ in the agent genome hash, the KB snapshot hash, the wording of the reasoning, and the evaluator scores. Each one byte-verifies under its own configuration. Improvement platforms cannot show you the earlier decision after the agent moves on; this architecture preserves it on purpose.

Earlier · 2026-04-25

DENY

LCD L34220 §B.1 prerequisites unmet (duration + adequate trial); ACR AC LBP variant 1 rating 1/9. Covered alternative cited: LCD §D supervised PT extension to 4–6 weeks with re-evaluation.

agent_genome_hash: sha256:genome-hea…
max_dimension_divergence: 1.0
ceo_flag: true

Read the v1 record →

Later · 2026-04-26

DENY

LCD L34220 §B.1 prerequisites unmet; compliance gap is the binding factor (PT 2-of-6) with duration as secondary (3 of 4 weeks). ACR AC LBP variant 1 rating 1/9 unchanged in 2026-04 KB. Covered alternative cited: LCD §D supervised PT extension to 4–6 weeks with documented compliance and a structured re-evaluation visit (telehealth or in-person).

agent_genome_hash: sha256:genome-hea…
max_dimension_divergence: 0.5
ceo_flag: false

Read the v2 record →

Same outcome, refined reasoning, tighter evaluator convergence: that's an agent improving. The earlier record is preserved exactly as v1 produced it (its ceo_flag is still true; its max_dimension_divergence is still 1.0), and its content_hash still verifies under v1's configuration. Three years from now, an auditor can replay either one. The Architecture page details the reproduction harness end to end.
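To make the comparison between the two records concrete, the sketch below flattens two abridged records and lists the field paths whose values differ. The helper names flatten and changed_paths are hypothetical, the hash strings are placeholders because the page truncates the real values, and the records omit most fields (including the reasoning text, which also changed between versions). The nesting follows the abridged record shown earlier, with the divergence field spelled as it appears on the case cards.

```python
_MISSING = object()  # sentinel so an absent field differs from a None value


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into {"a.b.c": value} pairs."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def changed_paths(earlier: dict, later: dict) -> list:
    """Sorted list of dotted paths whose values differ between two records."""
    a, b = flatten(earlier), flatten(later)
    return sorted(
        p for p in a.keys() | b.keys()
        if a.get(p, _MISSING) != b.get(p, _MISSING)
    )


# Abridged stand-ins for the two case_007 records; hashes are placeholders.
v1 = {
    "decision": {"verdict": "DENY"},
    "evaluation": {"convergence": {"ceo_flag": True,
                                   "max_dimension_divergence": 1.0}},
    "provenance": {"agent_genome_hash": "sha256:<v1-genome-placeholder>",
                   "kb_snapshot_hash": "sha256:<v1-kb-placeholder>"},
}
v2 = {
    "decision": {"verdict": "DENY"},
    "evaluation": {"convergence": {"ceo_flag": False,
                                   "max_dimension_divergence": 0.5}},
    "provenance": {"agent_genome_hash": "sha256:<v2-genome-placeholder>",
                   "kb_snapshot_hash": "sha256:<v2-kb-placeholder>"},
}

diff = changed_paths(v1, v2)
```

With these stand-ins, diff contains the two convergence fields and the two provenance hashes, while decision.verdict is absent because both versions deny.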