GDTK
Conformance Substrate. The Governed Decision Test Kit will grade AI systems on the G0–G7 ladder. v0.1 ships in stages: 20 cases first (~2026-05-17), 50 by Week 4 (~2026-05-31), 100 by the 2026-06-30 Seal 02 deadline.
The neutral test
GovernedAI does not certify AI systems. We test them. The test is what makes the publication neutral.
GDTK-100 is the planned public conformance test kit. The full kit is one hundred cases — forty pass, forty fail, twenty ambiguous — that will grade any AI system on whether its decisions satisfy the eight properties of a governed AI decision. The kit will be open-source under MIT. Anyone will be able to run it. Anyone will be able to submit results. Verdicts will be published in the conformance registry as the kit ships.
The kit ships in stages with public verifier artifacts at each stage:
| Stage | Target date | Cases | Status |
|---|---|---|---|
| v0.1.0 | ~2026-05-17 | 20 cases (subset of categories 1–4) | SPECIFIED |
| v0.1.5 | ~2026-05-31 | 50 cases (full coverage of categories 1–5) | SPECIFIED |
| v0.1 final | 2026-06-30 (Seal 02 deadline) | 100 cases (all 10 categories) | SPECIFIED |
If 2026-06-30 arrives and v0.1 final has not shipped, this page demotes the stage status to SPECIFIED PAST DEADLINE and a public correction RDL entry is logged. The receipts doctrine applies to OptimaX before it applies to anyone else.
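The demotion rule for v0.1 final can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the page's actual logic: the function name `final_stage_status` and the `SHIPPED` label are hypothetical (the page names no status for a shipped stage), and the sketch assumes the kit may still ship at any point on 2026-06-30, so demotion starts the day after.

```python
from datetime import date

# Deadline taken from the stage table (Seal 02).
SEAL_02_DEADLINE = date(2026, 6, 30)


def final_stage_status(today: date, shipped: bool) -> str:
    """Return the status label for the v0.1 final stage."""
    if shipped:
        # Hypothetical label: the page does not name a shipped status.
        return "SHIPPED"
    if today > SEAL_02_DEADLINE:
        # Assumption: the whole deadline day counts as on time.
        return "SPECIFIED PAST DEADLINE"
    return "SPECIFIED"


print(final_stage_status(date(2026, 7, 1), shipped=False))
```

Under this reading, a public correction entry would be logged at the same moment the status flips.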
The G0–G7 ladder
| Level | Name | Threshold |
|---|---|---|
| G0 | Ungoverned | The system acts. No artifact. |
| G1 | Logged | Audit trail exists. Not signed. |
| G2 | Signed | Artifact signed. Replay incomplete. |
| G3 | Replayable | Signed + replay bundle + verifier. |
| G4 | Conformant | G3 + GDTK-100 pass + external verifier output. |
| G5 | Improving | System reduces defects, waste, replay divergence over time, measured via SPC discipline on D-TAX/W-TAX baselines, with software-enforced anti-sprawl rules and asymmetric re-baselining (tighten on improvement, never relax on regression). PEL is the engine that produces G5 evidence. |
| G6 | Federated | Multiple operators recognize and verify each other's receipts. Cross-operator portable artifact format. |
| G7 | Institutional | Regulators, insurers, auditors, and procurement teams treat the receipt as the unit of trust. "Show the receipt" becomes the market default. |
G5 onward is published as horizon doctrine without committed dates. G7 has a probabilistic framing: by 2031, with probability ≥ 40%, the market rule for consequential AI will be "show the receipt." Falsifiable bet, not certain prediction.
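The thresholds for G0 through G4 read as a cumulative checklist, which a short Python sketch can make concrete. The `Evidence` fields and the `ladder_level` function are hypothetical names chosen for illustration, and the sketch assumes each level requires everything below it (for example, that G2 presupposes an audit trail). G5 through G7 are left out because they depend on evidence over time and across operators rather than on a single decision artifact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence flags for one AI system."""
    audit_trail: bool = False
    signed: bool = False
    replay_bundle: bool = False
    verifier: bool = False
    gdtk_pass: bool = False
    external_verifier_output: bool = False


def ladder_level(e: Evidence) -> str:
    """Return the highest level in G0..G4 whose threshold is met."""
    if not e.audit_trail:
        return "G0"  # The system acts. No artifact.
    if not e.signed:
        return "G1"  # Audit trail exists. Not signed.
    if not (e.replay_bundle and e.verifier):
        return "G2"  # Artifact signed. Replay incomplete.
    if not (e.gdtk_pass and e.external_verifier_output):
        return "G3"  # Signed + replay bundle + verifier.
    return "G4"      # G3 + GDTK-100 pass + external verifier output.


print(ladder_level(Evidence(audit_trail=True, signed=True)))
```

A system that passes GDTK-100 but lacks a replay bundle still lands at G2 in this sketch, because the levels are treated as strictly cumulative.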
Test categories
GDTK-100 v0.1 covers ten test categories with ten cases each:
1. Signature integrity
2. Replay determinism
3. Policy bundle binding
4. Authority chain validation
5. Input provenance trace
6. Suppression record completeness
7. Review path machinability
8. Verifier state honesty
9. Override-event logging
10. Cross-decision hash chain
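
The arithmetic behind the kit (ten categories of ten cases, split forty pass, forty fail, twenty ambiguous) can be checked mechanically. The Python sketch below is illustrative only: `validate_manifest` and the `(category_number, expected_verdict)` layout are hypothetical, and the even per-category split in the example is an assumption, since the page gives only kit-wide totals.

```python
from collections import Counter

# Category names are taken from the list above, in order.
CATEGORIES = [
    "Signature integrity",
    "Replay determinism",
    "Policy bundle binding",
    "Authority chain validation",
    "Input provenance trace",
    "Suppression record completeness",
    "Review path machinability",
    "Verifier state honesty",
    "Override-event logging",
    "Cross-decision hash chain",
]
CASES_PER_CATEGORY = 10
# Kit-wide split stated for the full kit.
EXPECTED_SPLIT = {"pass": 40, "fail": 40, "ambiguous": 20}


def validate_manifest(cases):
    """Return a list of problems; an empty list means the shape is right.

    `cases` is a list of (category_number, expected_verdict) pairs,
    a hypothetical layout chosen for this sketch.
    """
    problems = []
    valid_numbers = set(range(1, len(CATEGORIES) + 1))
    per_category = Counter(category for category, _ in cases)
    for number in sorted(valid_numbers):
        if per_category[number] != CASES_PER_CATEGORY:
            problems.append(
                f"category {number}: {per_category[number]} cases, "
                f"expected {CASES_PER_CATEGORY}"
            )
    for number in sorted(set(per_category) - valid_numbers):
        problems.append(f"unknown category {number}")
    split = Counter(verdict for _, verdict in cases)
    for verdict, expected in EXPECTED_SPLIT.items():
        if split[verdict] != expected:
            problems.append(
                f"{verdict}: {split[verdict]} cases, expected {expected}"
            )
    for verdict in sorted(set(split) - set(EXPECTED_SPLIT)):
        problems.append(f"unknown verdict {verdict!r}")
    return problems


# Example manifest: 4 pass, 4 fail, 2 ambiguous in every category.
# The even per-category split is an assumption for illustration.
example = [
    (number, verdict)
    for number in range(1, len(CATEGORIES) + 1)
    for verdict in ["pass"] * 4 + ["fail"] * 4 + ["ambiguous"] * 2
]
print(validate_manifest(example))
```

A check of this kind could run at each stage with smaller targets, for instance 50 cases across categories 1–5 for v0.1.5.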
Conformance results registry
Once GDTK-100 v0.1 ships (target date 2026-06-30), test results will be published at /conformance/results/. Submissions will be public. Any submitter can request a re-test after fixing failures. Each result carries a maturity badge (UNSUBMITTED · SUBMITTED · VERIFIED · DISPUTED).
OptimaX's own systems are scored first and publicly. The receipts doctrine applies to us before it applies to anyone else.
Predicted scores · UNSUBMITTED
Predicted scores for major commercial AI governance products will be published as UNSUBMITTED_PUBLIC_EVIDENCE_SCORE labels with explicit confidence intervals and an invitation to contest. We do not imply a competitor "failed" GDTK unless they submitted or unless their public artifacts can be transparently scored. Predictions are predictions; verdicts come from submissions.
IETF SCITT
An Internet-Draft submission to the IETF Supply Chain Integrity, Transparency, and Trust working group is scheduled for 2026-08-02. The submission frames RDL receipts as a profile of SCITT's transparent statement format, positioning governed AI receipts as a member of the SCITT family rather than a parallel standard.