Diagnose quantum ML.
Without changing
your code.
Wrap one object. Record every circuit execution. Six built-in analyzers tell you exactly what is going wrong — barren plateaus, shot noise, stalled convergence, and more.
# Before: standard Qiskit VQE
estimator = StatevectorEstimator()

# After: add two imports, wrap one object
from hilbertbench.integrations.qiskit import HilbertEstimatorProxy
from hilbertbench.recorder.tape import HilbertTape

with HilbertTape("runs/my_vqe") as tape:
    estimator = HilbertEstimatorProxy(tape)  # ← only change
    # ... your optimizer loop, completely unchanged ...

# Analyze after the run
trace = HilbertTrace(tape.dir_path)
result = detect_barren_plateau(trace)
# {'status': 'Trainable', 'variance': 0.215, ...}
Built for scientific reproducibility
Every design decision protects the integrity of your experimental record.
Non-intrusive by design
The proxy intercepts results without modifying them. It never re-executes circuits, never changes shot counts, and never injects synthetic jobs into your hardware queue. Your algorithm runs exactly as if HilbertBench were not there.
Cryptographically sealed traces
Every trace is append-only during recording and sealed with a SHA-256 hash on close. Tampering is detectable. Share a run directory with a collaborator and they can verify the evidence has not changed — before or after publication.
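To make the sealing idea concrete, here is a minimal sketch of hashing a run directory with SHA-256 and checking it later. It uses only the Python standard library. The function name seal_directory, the file name records.jsonl, and the way file names and contents are fed into the digest are assumptions for illustration, not HilbertBench's actual on-disk format or sealing procedure.

```python
import hashlib
import tempfile
from pathlib import Path


def seal_directory(run_dir: Path) -> str:
    """Return one SHA-256 hex digest covering every file under run_dir.

    Hypothetical helper: illustrates the idea of a seal, not HilbertBench's format.
    """
    digest = hashlib.sha256()
    # Sort paths so the digest does not depend on filesystem listing order.
    for path in sorted(p for p in run_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(run_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator between file name and file contents
        digest.update(path.read_bytes())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    records = run_dir / "records.jsonl"  # assumed file name
    records.write_text('{"step": 0, "value": -1.02}\n')

    seal = seal_directory(run_dir)                 # computed once, on close
    unchanged = seal_directory(run_dir) == seal    # a collaborator re-checks

    records.write_text('{"step": 0, "value": -1.50}\n')  # simulate tampering
    tampered = seal_directory(run_dir) != seal

    print("verified:", unchanged, "| tampering detected:", tampered)
```

Any change to a file name or a single byte of content produces a different digest, which is what makes tampering detectable by anyone holding the original seal.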
Evidence, not verdicts
The trace stores raw facts: circuits submitted, parameters bound, outcomes observed. Diagnostic conclusions are computed at read time and never written back. Re-analyze any trace with different thresholds or new algorithms — the evidence is permanent.
Six diagnostic axes. One function call each.
Pass a run directory or a HilbertTrace object. Every analyzer returns a plain dict, so you can compose them freely or run all at once with summary(trace).
Barren Plateau
Is the cost-landscape variance exponentially vanishing? Detects the signature of untrainable parameterized circuits (McClean et al., 2018).
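As a rough illustration of a variance-based check, the sketch below computes the variance of recorded cost values and labels the run at read time. The function name variance_verdict, the 1e-2 threshold, the "Barren plateau suspected" label, and the sample values are all assumptions; the real detect_barren_plateau may use different statistics and thresholds.

```python
import statistics


def variance_verdict(cost_values, threshold=1e-2):
    """Label a run from the variance of its recorded cost values.

    Hypothetical helper: the threshold and the labels are illustrative only.
    """
    variance = statistics.pvariance(cost_values)
    status = "Trainable" if variance >= threshold else "Barren plateau suspected"
    return {"status": status, "variance": variance}


# Made-up cost values sampled at different parameter settings.
flat_costs = [0.50, 0.51, 0.49, 0.50]     # landscape barely changes
varied_costs = [0.1, 0.9, -0.4, 0.6]      # landscape changes a lot

flat_result = variance_verdict(flat_costs)
varied_result = variance_verdict(varied_costs)

# The raw values are untouched, so the same data can be re-read
# with a different threshold and yield a different verdict.
relaxed_result = variance_verdict(flat_costs, threshold=1e-5)

print(flat_result["status"], varied_result["status"], relaxed_result["status"])
```

Because the verdict is derived from stored values rather than stored itself, changing the threshold never requires re-running a circuit.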
detect_barren_plateau(trace)
Shot Noise Ratio
Is the optimizer chasing signal or chasing noise? Computes the ratio of empirical outcome variance to the theoretical 1/shots shot-noise floor.
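The ratio described above can be sketched in a few lines. This is a minimal version under stated assumptions: outcomes are expectation values bounded in [-1, 1], so 1/shots is an upper bound on the sampling variance of one estimate, and the name shot_noise_ratio_sketch and the sample values are made up. It is not the library's implementation.

```python
import statistics


def shot_noise_ratio_sketch(outcomes, shots):
    """Empirical variance of the outcomes divided by the 1/shots noise floor.

    Hypothetical helper: assumes each outcome is an expectation value in [-1, 1].
    """
    if shots <= 0:
        raise ValueError("shots must be positive")
    noise_floor = 1.0 / shots
    return statistics.pvariance(outcomes) / noise_floor


# Made-up expectation values observed across optimizer steps.
outcomes = [0.2, 0.4, 0.6, 0.8]

ratio_16 = shot_noise_ratio_sketch(outcomes, shots=16)
ratio_1024 = shot_noise_ratio_sketch(outcomes, shots=1024)

print(f"16 shots: {ratio_16:.2f}   1024 shots: {ratio_1024:.2f}")
```

A ratio near or below 1 means the observed variation is no larger than sampling noise alone could produce, so the optimizer has little real signal to follow. The same outcomes measured with more shots give a proportionally larger ratio.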
shot_noise_ratio(trace)
Optimization Convergence
Is the parameter trajectory still moving? Tracks the late-to-early movement ratio and the outcome envelope across the full training run.
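One way to read "late-to-early movement ratio" is the mean parameter step length near the end of training divided by the mean step length at the start. The sketch below implements that reading; the function name movement_ratio, the window argument, and the toy trajectory are assumptions, and the library may define the ratio differently.

```python
import math


def movement_ratio(trajectory, window):
    """Mean step length over the last `window` steps divided by the first `window`.

    Hypothetical helper: `trajectory` is a list of parameter vectors, one per step.
    """
    steps = [math.dist(a, b) for a, b in zip(trajectory, trajectory[1:])]
    if window <= 0 or len(steps) < 2 * window:
        raise ValueError("need at least 2 * window steps")
    early = sum(steps[:window]) / window
    late = sum(steps[-window:]) / window
    if early == 0.0:
        # No early movement: the ratio is undefined unless nothing moved at all.
        return math.inf if late > 0.0 else 0.0
    return late / early


# Toy one-parameter trajectory whose steps halve each iteration.
trajectory = [(0.0,), (1.0,), (1.5,), (1.75,), (1.875,)]
ratio = movement_ratio(trajectory, window=2)

print(f"late/early movement ratio: {ratio:.2f}")
```

A ratio well below 1 says the parameters have nearly stopped moving. Whether that means convergence or a stall depends on the outcome values, which is why the analyzer also tracks the outcome envelope.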
optimization_convergence(trace)
Circuit Structure
How deep, how many qubits, what gate composition? Parses the stored OpenQASM to report depth, entangling fraction, and trainable parameter count.
circuit_structure(trace)
Expressibility (KL)
How uniformly does the ansatz cover Hilbert space? KL divergence of the empirical fidelity distribution against the Haar-random reference (Sim et al., 2019).
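The KL measure can be sketched with a histogram. For an n-qubit system with dimension N = 2^n, the Haar fidelity density is P(F) = (N - 1)(1 - F)^(N - 2), so the probability of a bin [a, b] is (1 - a)^(N - 1) - (1 - b)^(N - 1). The code below compares an empirical fidelity histogram against those bin probabilities. The function names, the bin count of 20, and the synthetic fidelity samples are assumptions, not the library's implementation.

```python
import math
import random


def haar_bin_probs(n_qubits, n_bins):
    """Exact Haar probability of each equal-width fidelity bin on [0, 1]."""
    dim = 2 ** n_qubits
    edges = [i / n_bins for i in range(n_bins + 1)]
    return [(1 - a) ** (dim - 1) - (1 - b) ** (dim - 1)
            for a, b in zip(edges, edges[1:])]


def kl_expressibility_sketch(fidelities, n_qubits, n_bins=20):
    """KL divergence of the fidelity histogram from the Haar reference.

    Hypothetical helper: lower values mean closer to Haar-random coverage.
    """
    counts = [0] * n_bins
    for f in fidelities:
        counts[min(int(f * n_bins), n_bins - 1)] += 1  # F = 1.0 goes in the last bin
    total = len(fidelities)
    kl = 0.0
    for count, q in zip(counts, haar_bin_probs(n_qubits, n_bins)):
        if count == 0:
            continue  # empty bins contribute nothing to the sum
        if q == 0.0:
            return math.inf  # Haar bin probability underflowed (large n_qubits)
        p = count / total
        kl += p * math.log(p / q)
    return kl


n_qubits = 2
dim = 2 ** n_qubits
rng = random.Random(0)

# Synthetic Haar-like fidelities via the inverse CDF: 1 - F = (1 - u)^(1 / (N - 1)).
haar_like = [1 - (1 - rng.random()) ** (1 / (dim - 1)) for _ in range(5000)]
# An idle circuit always returns the same state, so every fidelity is 1.
idle = [1.0] * 5000

kl_haar = kl_expressibility_sketch(haar_like, n_qubits)
kl_idle = kl_expressibility_sketch(idle, n_qubits)
print(f"Haar-like samples: {kl_haar:.4f}   idle circuit: {kl_idle:.2f}")
```

The Haar-like samples give a divergence close to zero, while the idle circuit, which covers a single point of Hilbert space, gives a large one.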
kl_expressibility(active_trace)
Noise Profile
What hardware noise did the run experience? Reads the calibration snapshot (T1, T2, readout error) and estimates the circuit fidelity under the NISQ model.
noise_profile(trace)
End-to-end tutorials
Each tutorial takes a real quantum ML failure mode, adds HilbertBench with one code change, and walks through the diagnostic output with the science behind it.
Why Isn't My VQE Converging?
Diagnosing Barren Plateaus
A 4-qubit hardware-efficient ansatz flatlines after a few optimizer steps. HilbertBench reveals a variance of 0.0007 — a barren plateau. Compare it side-by-side against a shallow trainable circuit.
Read tutorial →
Am I Using Enough Shots?
Diagnosing Shot Noise
A QAOA optimizer fails to find good solutions with 16 shots. HilbertBench shows SNR = 0.75 — the optimizer is chasing noise, not signal. Increasing to 1024 shots raises SNR to 38.9 and restores convergence.
Read tutorial →
Expressibility vs Trainability:
Measuring the Tradeoff
Holmes et al. (2022) proved that high expressibility implies barren plateaus. Here we measure both axes on shallow, medium, and deep ansatze and find the sweet spot — in three added lines of code.
Read tutorial →
How Hardware Noise Degrades
Your Results
Use circuit_structure to predict fidelity degradation before touching the hardware queue. Compare ideal vs. noisy Aer runs and build the depth-vs-fidelity contour for your device.
Running in under 5 minutes
No restructuring. No new primitives to learn. Just one swap and your existing code is fully instrumented.
Install
pip install hilbertbench[qiskit,storage]
Open a tape
Wrap your script in a with block: with HilbertTape("runs/") as tape:
Swap the proxy
Replace StatevectorEstimator() with HilbertEstimatorProxy(tape)
Analyze
Call detect_barren_plateau(trace) — or any of the six analyzers