mizan/tests/afi/test_capability_parity.py
Ryth Azhur 58d2cb2848 AFI parity: generate the matrix from conformance probes, not prose
The per-adapter parity table was hand-maintained prose. An adapter that
never wired a capability (FastAPI SSR, Axum WebSocket) got its gap
relabelled "Django-only" or "out of scope — use native equivalents," and
nothing went red. The de-scope was crystallized in five mutually-ratifying
sites: the README §Stack-extensions table, the AFI fixture docstring
("channels/forms/shapes aren't AFI-common"), the core registry's
extension-hook framing, the mizan-fastapi __init__ docstring, and a
"CSRF is Django-only" comment in two adapters' session endpoints.

Replace prose-parity with conformance-generated parity:

- tests/afi/manifest.py declares the AFI-common surface as data — one list
  of capabilities, one of adapters. Applicability ("—") is derived from
  transport, never typed.
- tests/afi/probes.py independently inspects each backend's source for the
  artifact a capability requires (comment-stripped, backend-scoped). Green
  means wired; a cell can't be set by editing a word.
- tests/afi/test_capability_parity.py asserts every (capability × applicable
  adapter) pair is wired. 35 unwired gaps are now loud red TFDD tests, each
  naming an owed binding. No xfail/skip.
- tests/afi/parity_table.py generates the README table from the probes;
  `make parity-check` fails CI on any hand-edit, like the codegen byte-parity.
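
A minimal sketch of how applicability might be derived from transport rather than typed. Every name here (`Transport`, `Capability.transports`, the capability and adapter ids, the `cell` helper) is an assumption made for illustration; the real `manifest.py` may model this differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Transport(Enum):
    HTTP = "http"
    IPC = "ipc"


@dataclass(frozen=True)
class Capability:
    id: str
    # Hypothetical field: the transports over which this capability can exist at all.
    transports: frozenset[Transport]


@dataclass(frozen=True)
class Adapter:
    id: str
    transport: Transport


def applies(cap: Capability, adapter: Adapter) -> bool:
    """Applicability is computed from the adapter's transport, never hand-entered."""
    return adapter.transport in cap.transports


# Illustrative data only; these ids are not taken from the real manifest.
CAPABILITIES = [
    Capability("header-invalidation", frozenset({Transport.HTTP})),
    Capability("rpc-call", frozenset({Transport.HTTP, Transport.IPC})),
]
ADAPTERS = [Adapter("django", Transport.HTTP), Adapter("tauri", Transport.IPC)]

NOT_APPLICABLE = "\u2014"  # the dash cell shown in the generated table


def cell(cap: Capability, adapter: Adapter, wired: set[tuple[str, str]]) -> str:
    """Render one table cell: a dash when not applicable, else the probe outcome."""
    if not applies(cap, adapter):
        return NOT_APPLICABLE
    return "pass" if (cap.id, adapter.id) in wired else "gap"


# Only applicable pairs become test cases; inapplicable ones are never parametrized.
PAIRS = [(c.id, a.id) for c in CAPABILITIES for a in ADAPTERS if applies(c, a)]
```

Under these assumptions, header invalidation over the IPC adapter never becomes a test case, and its table cell is computed as a dash instead of being written by hand.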

Purge the five de-scope sites. The IR byte-parity gate is unchanged and green.
`make test-afi` is now intentionally red on the 35 gaps — that board is the
owed parity work, itemized; a gap turns green by being wired, never described.
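
The hand-edit gate behind `make parity-check` can be pictured as a splice-and-compare. This is a sketch under stated assumptions: the marker strings and the `splice` and `is_current` helpers are hypothetical, and the real `parity_table.py` may delimit and regenerate the block differently.

```python
import re

# Hypothetical markers delimiting the generated block inside the README.
BEGIN = "<!-- parity:begin -->"
END = "<!-- parity:end -->"

_BLOCK = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)


def splice(text: str, block: str) -> str:
    """Replace whatever sits between the markers with a freshly generated block."""
    replacement = f"{BEGIN}\n{block}\n{END}"
    # A callable replacement avoids backslash-escape processing of the block text.
    new_text, count = _BLOCK.subn(lambda _match: replacement, text, count=1)
    if count != 1:
        raise ValueError("parity markers not found in README text")
    return new_text


def is_current(text: str, block: str) -> bool:
    """True when the committed text already equals a fresh regeneration."""
    return splice(text, block) == text


generated = "| capability | django |\n| rpc-call | pass |"
committed = f"# README\n{BEGIN}\n{generated}\n{END}\nfooter\n"
hand_edited = committed.replace("pass", "Django-only")
```

With this shape, `is_current(committed, generated)` holds, while any edit inside the markers makes the comparison fail until the block is regenerated.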

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 12:58:03 -04:00

119 lines
5.1 KiB
Python

"""
AFI capability parity: the runtime/surface conformance gate.

`test_codegen_parity.py` gates that the three backends emit byte-identical KDL.
That is necessary but narrow: it proves the IR agrees, not that an adapter
actually *implements* the capabilities the IR describes. The vacuum left by
"IR-shape only" is exactly where parity drifted — an adapter that never wired
SSR or WebSocket got its gap relabelled "Django-only" or "out of scope," and
nothing in the suite objected.

This module closes that vacuum. It parametrizes over every (capability,
applicable-adapter) pair drawn from `manifest.py` and asserts the adapter
actually wires the capability (`probes.py`). It is designed to be RED wherever a
gap is real — each failure names one owed binding. That redness is not a broken
build; it is the board of owed work, itemized and loud, that the prior prose
table hid behind false-green. A gap turns green by being *wired*, never by being
*described*.

Applicability is derived in `manifest.applies()` from the adapter's declared
transport, so a capability that simply does not exist over a transport (header
invalidation over Tauri IPC) is not parametrized here at all. It appears as "—" in
the generated table, computed, not a verdict anyone typed.
"""
from __future__ import annotations

import pytest

from manifest import ADAPTERS, CAPABILITIES, CAPABILITIES_BY_ID, PROBES_REQUIRED, applies
from probes import PROBES, run_probe


def _applicable_pairs() -> list[tuple[str, str, str]]:
    """(capability_id, adapter_id, test_id) for every pair the protocol applies to."""
    pairs: list[tuple[str, str, str]] = []
    for cap in CAPABILITIES:
        for adapter in ADAPTERS:
            if applies(cap, adapter):
                pairs.append((cap.id, adapter.id, f"{cap.id}::{adapter.id}"))
    return pairs


_PAIRS = _applicable_pairs()


@pytest.mark.parametrize(
    "capability_id,adapter_id",
    [(c, a) for c, a, _ in _PAIRS],
    ids=[tid for _, _, tid in _PAIRS],
)
def test_adapter_wires_capability(capability_id: str, adapter_id: str) -> None:
    """The adapter must wire the AFI-common capability the protocol declares.

    A failure here is one owed binding, not a regression. The message names the
    capability, the adapter, and what the probe could not find — that string is
    the gap's specification until someone closes it by wiring the artifact.
    """
    adapter = next(a for a in ADAPTERS if a.id == adapter_id)
    cap = CAPABILITIES_BY_ID[capability_id]
    result = run_probe(capability_id, adapter)
    assert result.state == "pass", (
        f"AFI parity gap — {adapter.title} does not wire '{cap.title}'.\n"
        f" probe: {result.detail}\n"
        f" state: {result.state}"
        + (" (◑ partial — declared/stubbed but not complete)" if result.state == "partial" else "")
        + f"\n This capability is AFI-common (manifest tier: {cap.tier.value}); every "
        f"adapter owes a binding. Close it by wiring the artifact the probe looks for — "
        f"not by editing a table."
    )


# ─── Meta-conformance: the manifest and the probes must stay in lockstep ───────


def test_every_capability_has_a_probe() -> None:
    """No capability may be declared without a probe — else it is unverifiable
    and silently 'passes' by never being checked, recreating the original hole."""
    missing = [c.id for c in CAPABILITIES if c.id not in PROBES]
    assert not missing, (
        f"Capabilities declared in manifest.py with no probe in probes.py: {missing}. "
        f"An unprobed capability is an un-gated parity claim — exactly the drift this "
        f"suite exists to prevent."
    )


def test_no_orphan_probes() -> None:
    """No probe may exist for a capability the manifest doesn't declare — that
    would be dead detection code drifting from the surface it claims to check."""
    orphans = [pid for pid in PROBES if pid not in CAPABILITIES_BY_ID]
    assert not orphans, (
        f"Probes in probes.py with no matching capability in manifest.py: {orphans}."
    )


def test_probe_count_matches_required() -> None:
    """Sanity pin: the manifest's own count of required probes equals the probe set."""
    assert len(PROBES) == PROBES_REQUIRED, (
        f"probes.py defines {len(PROBES)} probes; manifest expects {PROBES_REQUIRED}."
    )


def test_readme_parity_table_is_current() -> None:
    """The README parity table is generated output; a hand-edit must fail here.

    This is the lock that makes the original lie inexpressible. The table can no
    longer be edited to read 'Django-only' — it is spliced from the probe results
    by `parity_table.py`, and this test asserts the committed block matches a fresh
    regeneration. Drift → red → `make parity-table`.
    """
    import parity_table

    text = parity_table.README.read_text(encoding="utf-8")
    regenerated = parity_table._splice(text, parity_table.generate_block())
    assert regenerated == text, (
        "README parity table is stale or hand-edited. It is generated from the "
        "conformance probes — run `make parity-table` to regenerate it."
    )