Extending HilbertBench¶
HilbertBench is built to be extended in three places: new analyzers, new framework integrations, and new trace-schema fields. Each has a safe path and a few rules that keep an extension from quietly breaking the guarantees the framework rests on. The full contributor checklist is in CONTRIBUTING.md; this page is the architectural how-to.
Adding an analyzer¶
An analyzer is the easiest thing to add, because it only reads.
A new analyzer is a plain function: it takes a trace (or a run-directory
path) and returns a dictionary. It lives in hilbertbench/analysis/,
and it must follow the same contract as the existing six:
- Read-only. It never writes to the trace (INV-002). Take a HilbertTrace, resolve what you need, compute, return.
- Evidence in, dictionary out. Return a status string plus the quantitative evidence behind it. Do not invent a verdict the numbers do not support.
- Quantify uncertainty. If you estimate a statistic, attach a bootstrap confidence interval (see analysis/_util.py for the shared helper) and let the verdict report low confidence when the interval straddles a threshold.
- Degrade, don't guess. If the trace lacks what you need (an old trace with no calibration snapshot, say), return an "insufficient data" status, never a fabricated default (INV-008).
- Document the threshold. Any decision cutoff is a named constant with a comment explaining where the number came from. Expose it as a function argument so callers can override it.
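The contract above can be sketched as a small analyzer. This is a minimal illustration under stated assumptions, not code from the repository: ToyTrace stands in for HilbertTrace, the readout_errors field, the status strings, and the 0.05 cutoff are hypothetical, and the bootstrap helper is written inline rather than taken from analysis/_util.py, whose actual signature is not shown here.

```python
import random
import statistics
from dataclasses import dataclass


# Hypothetical stand-in for HilbertTrace; frozen so the analyzer cannot write to it.
@dataclass(frozen=True)
class ToyTrace:
    readout_errors: tuple = ()


# Hypothetical cutoff, for illustration only. A real analyzer would say here
# where the number came from (a paper, a calibration study, a spec).
READOUT_ERROR_THRESHOLD = 0.05


def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # seeded so the same trace gives the same verdict
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo_idx = int(alpha / 2 * n_resamples)
    return means[lo_idx], means[n_resamples - 1 - lo_idx]


def analyze_readout_error(trace, threshold=READOUT_ERROR_THRESHOLD, min_samples=5):
    """Return a status string plus the evidence behind it."""
    values = list(trace.readout_errors)  # work on a copy; the trace is only read

    # Degrade, don't guess: too little data means no verdict at all.
    if len(values) < min_samples:
        return {"status": "insufficient_data", "n_samples": len(values)}

    ci_low, ci_high = bootstrap_mean_ci(values)
    if ci_low > threshold:
        status = "above_threshold"
    elif ci_high < threshold:
        status = "below_threshold"
    else:
        # The interval straddles the cutoff, so the verdict says so.
        status = "low_confidence"

    return {
        "status": status,
        "mean": statistics.fmean(values),
        "ci_low": ci_low,
        "ci_high": ci_high,
        "threshold": threshold,
        "n_samples": len(values),
    }
```

Callers can override the cutoff, for example analyze_readout_error(trace, threshold=0.1), and every returned dictionary carries the numbers that justify its status.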
Add a test in tests/analysis/ that plants a known condition in a
constructed trace and asserts the verdict, then regenerate the
test catalog.
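A planted-condition test might look like the sketch below. Everything in it is hypothetical: ToyTrace, the shot_counts field, and analyze_shot_consistency are stand-ins defined inline so the example runs on its own. A real test would import the analyzer from hilbertbench/analysis/ and build a trace with the project's own helpers.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a constructed trace.
@dataclass(frozen=True)
class ToyTrace:
    shot_counts: tuple


def analyze_shot_consistency(trace):
    """Toy analyzer: reports whether the shot count changed between steps."""
    if not trace.shot_counts:
        return {"status": "insufficient_data"}
    distinct = sorted(set(trace.shot_counts))
    status = "consistent" if len(distinct) == 1 else "inconsistent"
    return {"status": status, "distinct_shot_counts": distinct}


def test_planted_inconsistency_is_detected():
    # Plant the condition: the last step uses a different shot count.
    trace = ToyTrace(shot_counts=(1024, 1024, 512))
    result = analyze_shot_consistency(trace)
    assert result["status"] == "inconsistent"
    assert result["distinct_shot_counts"] == [512, 1024]


def test_clean_trace_is_consistent():
    trace = ToyTrace(shot_counts=(1024, 1024, 1024))
    assert analyze_shot_consistency(trace)["status"] == "consistent"


def test_empty_trace_degrades():
    # No data must yield "insufficient_data", never a default verdict.
    assert analyze_shot_consistency(ToyTrace(shot_counts=()))["status"] == "insufficient_data"
```

The useful property is that each test knows the right answer in advance because it constructed the condition itself.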
Adding a framework integration¶
An integration is a transparent proxy. It is the only place a heavy quantum library may be imported, and it carries the heaviest responsibility, because it runs during the user's experiment.
- Parity is sacred. The proxy must not change the number of shots, executions, parameter bindings, or observables (INV-001). Forward the call to the real backend, wait for the result the user would have gotten anyway, and copy it aside. Never re-execute.
- Record after, not instead. Run the real call first; record from its result. Recording must never sit on the critical path in a way that could alter timing or fail the user's job.
- Failures are visible. Wrap recording so that a problem writing the trace cannot break the user's run, but is itself logged as an ERROR event, not silently dropped (INV-007).
- Import locally. Keep the framework import inside the integration module so the core stays dependency-free (INV-004), and add a clear message if an optional dependency is missing.
The existing Qiskit and PennyLane proxies are the templates to copy.
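The four rules above can be sketched in a framework-neutral way. This is an illustration under assumptions, not the shape of the actual Qiskit or PennyLane proxies: RecordingProxy, write_event, load_framework, the event dictionaries, and the logger name are all hypothetical, and the backend is assumed to expose a run method.

```python
import importlib
import logging

logger = logging.getLogger("hilbertbench.integration.sketch")  # hypothetical name


def load_framework(module_name):
    """Import an optional framework lazily, with a clear message if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"The optional dependency '{module_name}' is required for this "
            f"integration but is not installed."
        ) from exc


class RecordingProxy:
    """Forwards run() to the real backend and records from its result."""

    def __init__(self, backend, write_event):
        self._backend = backend
        self._write_event = write_event  # callable that persists one event dict
        self.error_events = []           # recording failures stay visible here

    def run(self, *args, **kwargs):
        # Parity: forward the call unchanged, exactly once. Never re-execute.
        result = self._backend.run(*args, **kwargs)

        # Record after, not instead: recording starts only once the result exists.
        try:
            self._write_event({"type": "EXECUTION", "result": repr(result)})
        except Exception as exc:
            # A recording failure must not break the user's run, but it is
            # kept as an ERROR event and logged rather than dropped.
            error_event = {"type": "ERROR", "message": f"trace write failed: {exc!r}"}
            self.error_events.append(error_event)
            logger.error(error_event["message"])

        # The user gets the same result object they would have gotten anyway.
        return result
```

In this sketch the framework import would happen inside the integration module, for instance by calling load_framework when the proxy is constructed, so importing the core package never pulls in a quantum library.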
Adding a trace-schema field¶
This is the one with the strictest rules, because the trace format is a long-lived contract.
- Edit the schema, never the models. Change the JSON Schema in schemas/, then regenerate the Python models (INV-003). See the schema guide.
- Respect the version freeze. Once a schema version is tagged it is frozen forever (INV-005). A new field goes into a new version directory (v1.1/), not into a released one.
- New fields are optional and degrade gracefully. Anything not present since the first release must be optional and nullable, so a reader of an older trace resolves it to None (INV-008).
- Evidence only. The new field records what happened, never an interpretation of it (INV-006). Fields like is_converged or quality_score do not belong in a trace.
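The "optional and nullable" rule can be illustrated with a schema fragment and a reader. The sketch below rests on assumptions: queue_wait_seconds is a hypothetical evidence field, the fragment is written as a Python dict rather than a file under schemas/, and ExecutionRecord is a hand-written stand-in for a generated model, which in the real project would be regenerated from the schema and not edited by hand.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical v1.1 schema fragment. The new field is nullable and is not
# listed under "required", so traces written before it existed still validate.
SCHEMA_V1_1_FRAGMENT = {
    "type": "object",
    "properties": {
        "shots": {"type": "integer"},
        "queue_wait_seconds": {"type": ["number", "null"]},
    },
    "required": ["shots"],
}


# Stand-in for the model that would be generated from the schema.
@dataclass(frozen=True)
class ExecutionRecord:
    shots: int
    queue_wait_seconds: Optional[float] = None


def read_record(raw_json):
    """Read one record; a field missing from an older trace resolves to None."""
    data = json.loads(raw_json)
    return ExecutionRecord(
        shots=data["shots"],
        # .get returns None when the key is absent; no default value is invented.
        queue_wait_seconds=data.get("queue_wait_seconds"),
    )


old_record = read_record('{"shots": 1024}')
new_record = read_record('{"shots": 1024, "queue_wait_seconds": 2.5}')
```

Here old_record.queue_wait_seconds is None, which lets an analyzer report insufficient data, while new_record carries the measured value. The field records a measurement, so it also stays on the evidence side of the evidence-only rule.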
The rule behind the rules¶
Every extension rule traces back to one question: does this change keep the trace trustworthy? If an addition could let the recorder perturb an experiment, let a trace be edited after the fact, or let an analyzer invent data, it is rejected — regardless of how useful it seems. The compliance suite exists to catch the cases where good intentions would have broken a guarantee.