Replay
Decision record replay
Each record below is a complete audicta.pa.v1 document: content-addressed, evaluator-isolated, and byte-verifiable in your browser. Click a case to read the full reasoning chain and verify the SHA-256 content_hash yourself.
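As a rough illustration of what "verify the content_hash yourself" could involve, the sketch below recomputes a SHA-256 digest over a record and compares it with the stored value. The canonical form used here (the record minus its own content_hash field, serialized as UTF-8 JSON with sorted keys and compact separators) is an assumption made for the example; the page does not state the actual audicta.pa.v1 canonicalization rules, and the function names are hypothetical.

```python
import hashlib
import json


def compute_content_hash(record: dict) -> str:
    """Recompute a content hash for a record.

    ASSUMPTION: the hash covers every top-level field except content_hash,
    serialized as sorted-key, compact, UTF-8 JSON. The real audicta.pa.v1
    canonicalization may differ.
    """
    body = {k: v for k, v in record.items() if k != "content_hash"}
    canonical = json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "sha256:" + digest


def verify_record(record: dict) -> bool:
    """Return True if the stored content_hash matches the recomputed one."""
    claimed = record.get("content_hash")
    return claimed is not None and claimed == compute_content_hash(record)
```

Under this scheme, changing any field after sealing (for example flipping the verdict) changes the recomputed digest, so verify_record returns False for a tampered record.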
Curated invocation set
Records are added by hand from the authorized live invocation queue. Each one was reviewed before publication; the corpus is not an automatic feed.
A log line, vs. an evidence record
Two records of the same denial decision. The left is what an observability stack typically captures. The right is what Audicta produces — at decision time, sealed by content hash.
Without Audicta
What your AI gives you today
{
"ts": "2026-04-25T14:31:44Z",
"model": "gpt-4-turbo",
"input": "[case_007 prompt …]",
"output": "DENY"
}
A log line.
You can defend exactly one thing: that the model said it. The reasoning, the alternatives, the evidence base, the cross-check — all gone. Months from now, an audit asking why has nothing to read.
With Audicta
What an audicta.pa.v1 record gives you
{
"record_version": "audicta.pa.v1",
"record_id": "rec_2026-04-25T14-31-44Z…",
"case": { "case_id": "case_007", … },
"agent_chain": [
{ "agent": "clinical_reviewer", "output": { what, why,
alternatives_considered, tradeoff, citations } },
{ "agent": "criteria_mapper", "output": { … } },
{ "agent": "evidence_retriever", "output": { … } }
],
"decision": {
"verdict": "DENY",
"verdict_basis": "LCD §B.1 prerequisites unmet …",
"alternative_pathway_cited": { "service": "…", "covered": true }
},
"evaluation": {
"evaluator_local": { "model": "qwen2.5-coder:7b", "scores": {…} },
"evaluator_cloud": { "model": "claude-sonnet-4", "scores": {…} },
"convergence": { "ceo_flag": true, "max_div": 1.0 }
},
"provenance": {
"agent_genome_hash": "sha256:…",
"kb_snapshot_hash": "sha256:…"
},
"content_hash": "sha256:…"
}
An evidence record.
You can defend the reasoning, the alternatives considered, the citations, the cross-evaluator agreement, and the integrity of the document itself — independently. The difference is not "more logging." It's a different kind of artifact.
Cases in the corpus
DENY · Lumbar MRI · Borderline deny · Incomplete conservative trial
A patient with three weeks of mild low back pain asks for a lumbar MRI. Insurance requires the patient to first complete a documented physical-therapy trial — this patient attended only 2 of 6 prescribed sessions over three weeks, so the record denies the imaging and points to the physical-therapy program insurance actually covers as the next step.
Why this case is in the corpus: When criteria are not met, the record states why and what to do instead: the denial is paired with a covered alternative pathway (LCD §D). The system flags its own borderline cases for human review via ceo_flag.
Read the record
APPROVE · Lumbar MRI · Clear approve · Red-flag bypass
A patient with worsening nerve symptoms (numbness, weakness) asks for a lumbar MRI. The clinical findings are 'red flags' that bypass the usual requirement to wait four weeks and try basic care first, so the record approves the imaging immediately.
Why this case is in the corpus: When the input is unambiguous, the output is unambiguous: a clean record, dense citations, and tightly converging evaluators.
Read the record
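The evaluator convergence block that appears in each record (max_dimension_divergence and ceo_flag) can be pictured with a small sketch. It takes the per-dimension scores from the local and cloud evaluators, finds the largest absolute gap, and raises the flag when that gap reaches a threshold. The threshold of 1.0 and the dimension names are assumptions chosen only to match the two examples shown on this page (1.0 flagged, 0.5 not flagged); the actual flagging rule is not documented here.

```python
def convergence(
    local_scores: dict, cloud_scores: dict, flag_threshold: float = 1.0
) -> dict:
    """Compare two evaluators' scores dimension by dimension.

    ASSUMPTION: ceo_flag is raised when the largest per-dimension gap
    is at or above flag_threshold. The real rule may differ.
    """
    shared = local_scores.keys() & cloud_scores.keys()
    if not shared:
        raise ValueError("evaluators share no scoring dimensions")
    # Largest absolute disagreement across the shared dimensions.
    max_div = max(abs(local_scores[d] - cloud_scores[d]) for d in shared)
    return {
        "max_dimension_divergence": max_div,
        "ceo_flag": max_div >= flag_threshold,
    }


# Hypothetical dimension names and scores, for illustration only.
flagged = convergence(
    {"citation_quality": 4.0, "reasoning": 3.0},
    {"citation_quality": 4.0, "reasoning": 2.0},
)
```

Keeping the divergence value alongside the boolean lets a reviewer see how close a case was to the flag boundary, not just which side it fell on.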
The same case, two agent versions
Audicta's reproducibility property applied across time. Both records below describe case_007; the only differences are the agent genome hash, the KB snapshot hash, and the evaluator scores. Each one byte-verifies under its own configuration. Improvement platforms cannot show you the earlier decision after the agent moves on; the architecture preserves it on purpose.
Earlier · 2026-04-25
DENY
LCD L34220 §B.1 prerequisites unmet (duration + adequate trial); ACR AC LBP variant 1 rating 1/9. Covered alternative cited: LCD §D supervised PT extension to 4–6 weeks with re-evaluation.
- agent_genome_hash: sha256:genome-hea…
- max_dimension_divergence: 1
- ceo_flag: true
Read the v1 record →
Later · 2026-04-26
DENY
LCD L34220 §B.1 prerequisites unmet; compliance gap is the binding factor (PT 2-of-6) with duration as secondary (3 of 4 weeks). ACR AC LBP variant 1 rating 1/9 unchanged in 2026-04 KB. Covered alternative cited: LCD §D supervised PT extension to 4–6 weeks with documented compliance and a structured re-evaluation visit (telehealth or in-person).
- agent_genome_hash: sha256:genome-hea…
- max_dimension_divergence: 0.5
- ceo_flag: false
Read the v2 record →
Same outcome, refined reasoning, tighter evaluator convergence: that's an agent improving. The earlier record is preserved exactly as v1 produced it (its ceo_flag is still true; its max_dimension_divergence is still 1.0), and its content_hash still verifies under v1's configuration. Three years from now an auditor can replay either one. Architecture details the reproduction harness end to end.
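One way to check which fields actually differ between two versions of a record is to walk both documents and list the paths whose values disagree. The sketch below is a generic JSON diff, not Audicta's reproduction harness; the function name and the placeholder hash values in the usage lines are hypothetical.

```python
def diff_paths(a, b, prefix: str = "") -> list:
    """Return dotted paths at which two JSON-like values differ.

    Nested dicts are compared key by key; lists and scalars are compared
    as whole values, so a changed list element reports the list's path.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        paths = []
        for key in sorted(a.keys() | b.keys()):
            child = f"{prefix}.{key}" if prefix else key
            if key not in a or key not in b:
                # Present in only one of the two records.
                paths.append(child)
            else:
                paths.extend(diff_paths(a[key], b[key], child))
        return paths
    return [] if a == b else [prefix]


# Placeholder values for illustration; real records carry full hashes.
v1 = {
    "decision": {"verdict": "DENY"},
    "provenance": {"agent_genome_hash": "sha256:aaa", "kb_snapshot_hash": "sha256:bbb"},
}
v2 = {
    "decision": {"verdict": "DENY"},
    "provenance": {"agent_genome_hash": "sha256:ccc", "kb_snapshot_hash": "sha256:bbb"},
}
changed = diff_paths(v1, v2)
```

Run against two full records, a diff like this would presumably also list content_hash, since any hash that covers the changed fields has to change with them.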