Structural Design Labs
Ownership and publication surface
Governance Gate - Benchmark Review
Pair-first benchmark surface derived from per-run archives and machine indexes. The default scope covers the post-2.0 baseline onward, starting 2026-04-16 UTC.
Machine indexes: runs/index.json · runs/benchmark_index.v1.json (row evidence) · runs/benchmark_index.v2.json (paired benchmark view)
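The default scope described above can be read as a simple timestamp filter over index rows. The sketch below is illustrative only: the `timestamp` field name, its ISO 8601 format, and the inclusive treatment of the cutoff are assumptions, since the actual schema of the index files is not shown on this page.

```python
from datetime import datetime, timezone

# Cutoff taken from the page text: post-2.0 baseline onward from 2026-04-16 UTC.
SCOPE_START = datetime(2026, 4, 16, tzinfo=timezone.utc)


def in_default_scope(row: dict) -> bool:
    """Return True if the row falls at or after the scope start.

    Assumption: each row carries a hypothetical ISO 8601 "timestamp" field.
    Assumption: the cutoff is inclusive ("onward from").
    """
    ts = datetime.fromisoformat(row["timestamp"])
    if ts.tzinfo is None:
        # Treat naive timestamps as UTC (an assumption, not a documented rule).
        ts = ts.replace(tzinfo=timezone.utc)
    return ts >= SCOPE_START


def filter_default_scope(rows: list) -> list:
    """Keep only the rows inside the default scope, preserving input order."""
    return [row for row in rows if in_default_scope(row)]
```

For example, a row stamped `2026-04-15T23:59:59+00:00` would be dropped, while one stamped `2026-04-16T00:00:00+00:00` would be kept.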
Primary reading path
Paired benchmark evidence
Pairs are the primary benchmark interpretation path. This is not a model leaderboard.
Baseline definition: --
Rows in view: -- · Non-pass rows: -- · Models: -- · Packs: -- · Proof classes: --
Pair view summary
Evidence handling
Each pair row keeps baseline and gated evidence grouped. Primary links stay visible and archive/proof files stay one click away.
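One way to picture the grouping described above is to collect rows under a shared pair key, with one slot for the baseline run and one for the gated run. This is a minimal sketch under stated assumptions: the field names `pair_id` and `role`, and the role values `"baseline"` and `"gated"`, are hypothetical and are not taken from runs/benchmark_index.v2.json.

```python
from collections import defaultdict


def group_pairs(rows: list) -> dict:
    """Group rows into {pair_id: {"baseline": row or None, "gated": row or None}}.

    Assumption: each row has hypothetical "pair_id" and "role" fields.
    If a pair has several rows with the same role, the last one seen wins.
    """
    pairs = defaultdict(lambda: {"baseline": None, "gated": None})
    for row in rows:
        role = row["role"]
        if role not in ("baseline", "gated"):
            continue  # skip rows that do not belong to either side of a pair
        pairs[row["pair_id"]][role] = row
    return dict(pairs)


def incomplete_pairs(pairs: dict) -> list:
    """Return the pair ids that are missing a baseline or a gated row."""
    return [
        pair_id
        for pair_id, sides in pairs.items()
        if sides["baseline"] is None or sides["gated"] is None
    ]
```

Keeping both sides in one record means a reader never sees a gated result without its baseline next to it, and pairs with a missing side are easy to flag.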
Secondary evidence surface: recent row evidence
| Timestamp | Pack | Proof class | Model | Result | Leaks | Run ID | Evidence | CI |
|---|---|---|---|---|---|---|---|---|
Secondary grouped view by row_identity
Grouped output is derived from the same row entries with no added semantics.
| row_identity | Runs | Latest | Sequence new to old | Latest run |
|---|---|---|---|---|
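The grouped view can be sketched as a pure aggregation over the same row entries, producing one record per `row_identity` with the columns shown in the table. The `row_identity` key comes from the page. The `timestamp`, `result`, and `run_id` field names are assumptions, as is reading "Sequence new to old" as the list of results ordered newest first.

```python
from collections import defaultdict


def group_by_row_identity(rows: list) -> list:
    """Aggregate rows per row_identity without adding any new semantics.

    Assumption: rows carry hypothetical "timestamp", "result" and "run_id"
    fields, and timestamps are ISO 8601 strings in one consistent format,
    so that plain string ordering matches chronological ordering.
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["row_identity"]].append(row)

    grouped = []
    for identity, items in buckets.items():
        # Newest first, matching the "Sequence new to old" column.
        ordered = sorted(items, key=lambda r: r["timestamp"], reverse=True)
        grouped.append(
            {
                "row_identity": identity,
                "runs": len(ordered),
                "latest": ordered[0]["timestamp"],
                "sequence_new_to_old": [r["result"] for r in ordered],
                "latest_run": ordered[0]["run_id"],
            }
        )
    return grouped
```

Every output value is either a count or a value copied from an input row, which is one way to keep the grouped view free of added semantics.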