Caching and Trimming Errors per Detector in Color and BB Codes #46


Open
wants to merge 38 commits into base: main

Conversation

draganaurosgrbic
Contributor

@draganaurosgrbic commented Jul 9, 2025

Summary

This pull request introduces a new, optional, and experimental heuristic to the Tesseract decoder. The heuristic, named cache-and-trim-errors, explores the potential for significant decoding speedups by exploiting a newly observed sparsity in the errors that contribute to the get_detcost function. The primary goal of this work is not to provide a universally applicable optimization, but to serve as a proof of concept for how much performance could be gained if the decoder could intelligently identify the most relevant errors for a given detector instead of exploring all of them.


Motivation

During extensive performance benchmarking of the Tesseract decoder, I observed a consistent and interesting behavior in the get_detcost admissible heuristic. This function, which computes a detector's cost from the minimum cost among its relevant errors, was found to be a significant bottleneck for certain code families.

My investigation revealed that for any given detector, only a small percentage of all potentially relevant errors actually contribute to its minimum cost during a decoding run. This "sparsity" of contributing errors was observed across all tested code families:

| Code Family | Minimum % | Maximum % | Average % |
| --- | --- | --- | --- |
| Color Codes | 4.90% | 28.66% | 10.57% |
| Bivariate-Bicycle Codes | 3.79% | 42.76% | 10.13% |
| NLR5 Bivariate-Bicycle Codes | 3.47% | 34.97% | 9.80% |
| NLR10 Bivariate-Bicycle Codes | 2.94% | 20.81% | 6.71% |
| Surface Codes | 8.70% | 36.09% | 13.00% |
| Transversal CNOT Protocols | 4.07% | 22.27% | 11.17% |

This consistent sparsity presents a significant optimization opportunity, especially for configurations with higher error rates and larger code distances where get_detcost consumes a larger portion of the total decoding time. The core question this work seeks to answer is: what is the performance ceiling for the decoder if we could effectively predict and prune non-contributing errors?
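
As a rough illustration of how these contribution percentages could be gathered, here is a minimal C++ sketch that instruments a plain minimum-cost scan to record which error index achieves the minimum on each call. The structures and names (DetectorStats, min_cost_with_stats, error_cost) are hypothetical simplifications, not the decoder's actual internals.

```cpp
#include <cstddef>
#include <limits>
#include <unordered_set>
#include <vector>

// Hypothetical instrumentation: track which errors ever achieve a detector's
// minimum cost, so the fraction of "contributing" errors can be reported later.
struct DetectorStats {
  std::unordered_set<std::size_t> contributing;  // error indices that were ever the argmin
};

// Same result as a plain minimum scan over the detector's candidate errors,
// plus bookkeeping of the argmin. Assumes candidate_errors is non-empty.
double min_cost_with_stats(const std::vector<std::size_t>& candidate_errors,
                           const std::vector<double>& error_cost,
                           DetectorStats& stats) {
  double best = std::numeric_limits<double>::infinity();
  std::size_t best_error = candidate_errors.front();
  for (std::size_t e : candidate_errors) {
    if (error_cost[e] < best) {
      best = error_cost[e];
      best_error = e;
    }
  }
  stats.contributing.insert(best_error);
  return best;
}

// After a decoding run: percentage of the detector's candidates that ever contributed.
double contributing_percentage(const DetectorStats& stats, std::size_t num_candidates) {
  return 100.0 * static_cast<double>(stats.contributing.size()) /
         static_cast<double>(num_candidates);
}
```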


Proposed Solution: The Cache-and-Trim-Errors Heuristic

To explore this question, I developed a new heuristic that modifies the get_detcost function. This heuristic is implemented as an optional, user-configurable strategy with two key steps:

  1. Caching: During the initial phase of a decoding run, for each detector, the decoder tracks and caches the specific errors that contribute to its minimum cost as computed by get_detcost.
  2. Trimming: Once the number of cached errors for a detector reaches a pre-defined threshold, all other errors are permanently trimmed from consideration for that detector. Subsequent calls to get_detcost for that detector will only operate on this smaller, cached set of errors.

This approach serves as a simple, yet effective, experiment to measure the trade-off between speed and accuracy when a significant portion of the search space is aggressively pruned. This is a heuristic, and as such, it directly affects the decoder's accuracy by potentially removing paths to the globally optimal solution.
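
The sketch below illustrates the two steps under simplifying assumptions: a flat per-error cost array, a per-detector list of candidate error indices, and a threshold expressed as a fraction of that list. Names such as DetectorErrorCache and get_detcost_cached are illustrative and do not mirror the actual Tesseract implementation.

```cpp
#include <cstddef>
#include <limits>
#include <unordered_set>
#include <vector>

// Hypothetical per-detector state for the cache-and-trim strategy.
struct DetectorErrorCache {
  std::unordered_set<std::size_t> cached;  // errors observed to contribute to the minimum
  bool trimmed = false;                    // once set, only the cached errors are scanned
};

double get_detcost_cached(std::size_t detector,
                          const std::vector<std::vector<std::size_t>>& errors_for_detector,
                          const std::vector<double>& error_cost,
                          double threshold_fraction,  // e.g. 0.10 or 0.20
                          std::vector<DetectorErrorCache>& caches) {
  DetectorErrorCache& cache = caches[detector];
  const std::vector<std::size_t>& all_errors = errors_for_detector[detector];

  double best = std::numeric_limits<double>::infinity();

  // Trimming step: after the threshold is reached, restrict the scan to the cached set.
  if (cache.trimmed) {
    for (std::size_t e : cache.cached) {
      if (error_cost[e] < best) best = error_cost[e];
    }
    return best;
  }

  // Caching step: scan all candidates and remember which one achieved the minimum.
  std::size_t best_error = all_errors.front();  // assumes a non-empty candidate list
  for (std::size_t e : all_errors) {
    if (error_cost[e] < best) {
      best = error_cost[e];
      best_error = e;
    }
  }
  cache.cached.insert(best_error);

  // Once enough distinct contributing errors have been cached, trim permanently.
  if (static_cast<double>(cache.cached.size()) >=
      threshold_fraction * static_cast<double>(all_errors.size())) {
    cache.trimmed = true;
  }
  return best;
}
```

Because trimming is permanent for the remainder of the run, an error that would only become relevant later is never reconsidered for that detector; this is the source of the accuracy risk discussed in the Caveats section.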

A more sophisticated approach to this problem, beyond the scope of this pull request, would be to gather extensive performance data from decoding runs and use it to train an AI/ML model. Such a model could potentially learn to predict the most likely contributing errors for a detector, eliminating the need to explore all of them and providing a more robust, non-heuristic solution.


User Flags

This heuristic can be enabled and configured using the following flags:

  • --cache-and-trim-detcost: Enables the heuristic, replacing the standard get_detcost function with the optimized version.
  • --detcost-cache-threshold=<percentage>: Specifies the percentage threshold (e.g., 10%, 20%) at which error trimming occurs for a detector.
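
One plausible reading of the threshold flag, assuming the percentage is taken relative to the number of errors relevant to each detector (the exact semantics live in the implementation and are not reproduced here), is a per-detector cutoff count:

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical mapping from --detcost-cache-threshold=<percentage> to the number
// of distinct cached errors that triggers trimming for one detector.
std::size_t trim_cutoff(double threshold_percentage, std::size_t num_errors_for_detector) {
  return static_cast<std::size_t>(std::ceil(
      threshold_percentage / 100.0 * static_cast<double>(num_errors_for_detector)));
}
```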

Caveats

It is crucial to understand that this heuristic is designed for specific performance-critical scenarios. It should not be used as a general-purpose optimization. Applying this heuristic to a code family or configuration that does not spend significant time in get_detcost will likely degrade performance or significantly impact accuracy.

This is because the heuristic introduces an initial overhead for tracking and caching errors. This overhead can only be justified if the subsequent performance gains from trimming errors are substantial enough to compensate.

This heuristic is not effective for and should be avoided on:

  • Code families that already execute fast: For codes like Surface Codes, which are already decoded quickly, the initial caching overhead will likely outweigh any minimal gains, leading to a net degradation in performance.
  • Configurations with low get_detcost time: The heuristic is only beneficial when get_detcost is a major bottleneck. For configurations with lower error rates and smaller code distances, or for code families like Transversal CNOT Protocols where get_detcost is not a dominant factor, the time saved by trimming errors is negligible.
  • Configurations where accuracy is paramount: Because this is a heuristic, it fundamentally alters the A* search path. Trimming errors prematurely can remove the true optimal path, leading to a less accurate result. This risk is particularly high for configurations where the decoder doesn't spend a lot of time in get_detcost, as the trimming may occur before enough relevant errors have been discovered.

The purpose of this heuristic is to serve as a tool for investigation and benchmarking on performance-demanding configurations (like Color Codes and Bivariate-Bicycle Codes), not as a default optimization for the decoder.


Performance and Accuracy Evaluation

This heuristic's impact was evaluated on code families where get_detcost is a primary bottleneck: Color Codes and Bivariate-Bicycle Codes. The data below summarizes the cumulative speedups and accuracy impacts for these code families, building on the performance gains from prior pull requests.

Color Codes

  • p=0.001
    • 20% threshold: Achieved 45.5-60.3% cumulative speedup. Accuracy was largely unaffected.
    • 10% threshold: Achieved 49.5-63.3% cumulative speedup. Accuracy degradation was minimal, with only one configuration showing a notable increase in low-confidence results (r=11: from 45 to 79).
  • p=0.002
    • 20% threshold: Achieved 56.8-60.6% cumulative speedup. A few configurations showed minor accuracy changes (~10 more low-confidence results).
    • 10% threshold: Achieved 62.5-66.5% cumulative speedup. Accuracy degradation was noticeable for the r=11 configuration (low-confidence results rose from 646 to 730).

Bivariate-Bicycle Codes

  • p=0.001
    • 20% threshold: Achieved 46.3-73.6% cumulative speedup. Worst-case accuracy degradation was ~20 more low-confidence results.
    • 10% threshold: Achieved 49.5-82.3% cumulative speedup. Worst-case accuracy degradation reached ~60 more low-confidence results.
  • p=0.002
    • 20% threshold: Achieved 56.9-82.5% cumulative speedup. Accuracy impact was generally lower than at p=0.001.
    • 10% threshold: Achieved 63.6-84.5% cumulative speedup. Accuracy impact was generally lower than at p=0.001.

The graphs below provide a detailed visual breakdown of these results:

[Four graphs: per-configuration speedup and accuracy results for Color Codes and Bivariate-Bicycle Codes]

Key Contributions

  • Identified and Characterized Error Sparsity: Discovered and quantified a consistent sparsity in the errors that affect the get_detcost function across various code families and configurations.
  • Developed a Novel Heuristic: Designed and implemented the cache-and-trim-errors heuristic as an experimental tool to evaluate the potential for significant performance gains by pruning the search space.
  • Comprehensive Benchmarking: Conducted extensive experiments to validate the heuristic's effectiveness and its trade-off between speed and decoding accuracy, especially for high-cost configurations.
  • Demonstrated Significant Speedups: Achieved cumulative performance gains of up to 84.5% for specific code families at higher error rates and larger code distances, while maintaining robust accuracy.
  • Provided a Foundation for Future Work: This heuristic and its benchmarking data provide valuable insights and a starting point for more advanced solutions, such as using AI/ML to predict contributing errors.

draganaurosgrbic and others added 30 commits June 14, 2025 14:52
@draganaurosgrbic changed the title from "Caching and trimming errors from 'get_detcost' function" to "Caching and trimming errors per detector" on Jul 14, 2025
@draganaurosgrbic requested a review from LalehB on July 14, 2025 23:56
@LalehB
Collaborator

LalehB commented Jul 15, 2025

Thank you @draganaurosgrbic for the plots and details.
I think we need to be careful about these benchmarks and how we are drawing conclusions from them, such as:

  • "Bivariate Bicycle Code (NLR5WB, r=10, d=10, p=0.002, 10% threshold): This was an outstanding case, demonstrating over 110% speedup (2.1x faster) with absolutely no change in low confidence results and errors results."

or

  • "Bivariate Bicycle Code (NLR10WB, r=10, d=10, p=0.002, 10% threshold): Delivered a 93% speedup (1.93x faster) with no change in low confidence results/errors results."

However, the plotted accuracy here is based on the sum of low-confidence and error counts, which equals 500 in both examples, while the total number of shots in each benchmark is also 500. This means the decoder failed to decode any of the shots. Because of this, the accuracy is already at its worst and cannot get any worse regardless of which heuristic is used.
So we cannot conclude that the heuristic had no impact on accuracy for these examples, since the baseline accuracy was already maximally bad in these benchmarks.

@draganaurosgrbic
Contributor Author

draganaurosgrbic commented Jul 15, 2025

@LalehB Thank you for your feedback; I will update the conclusions for those two examples. I intentionally left them in the discussion because they exhibited longer decoding times due to not converging within the selected priority-queue limit. I used a script that filters for benchmarks with low accuracy impact, including those where there was no accuracy change at all.

@draganaurosgrbic
Contributor Author

draganaurosgrbic commented Jul 15, 2025

@LalehB I updated the PR description.

@draganaurosgrbic changed the title from "Caching and trimming errors per detector" to "Caching and trimming errors per detector in Color and BB codes" on Jul 17, 2025
@draganaurosgrbic changed the title from "Caching and trimming errors per detector in Color and BB codes" to "Caching and Trimming Errors per Detector in Color and BB Codes" on Jul 20, 2025
@draganaurosgrbic requested a review from noajshu on July 20, 2025 05:09
@draganaurosgrbic
Contributor Author

@LalehB @noajshu The PR description is updated; it now includes comprehensive data for analyzing the impact of the flag on performance and accuracy, a summary list of the major contributions, and more. If you run into any issues, please let me know.

@draganaurosgrbic removed the request for review from noajshu on August 3, 2025 10:54