Caching and Trimming Errors per Detector in Color and BB Codes #46


Open
wants to merge 38 commits into base: main

Conversation

draganaurosgrbic
Contributor

@draganaurosgrbic commented Jul 9, 2025

Summary

This pull request introduces a new, optional, and experimental heuristic to the Tesseract decoder. The heuristic, named cache-and-trim-errors, explores the potential for significant decoding speedups by exploiting a newly observed sparsity in the errors that contribute to the get_detcost function. The primary goal of this work is not to provide a universally applicable optimization, but to serve as a proof of concept for how much performance could be gained if the decoder could intelligently identify the most relevant errors for a given detector instead of exploring all of them.


Motivation

During extensive performance benchmarking of the Tesseract decoder, I observed a consistent and interesting behavior in the get_detcost admissible heuristic. This function, which computes a detector's cost from the minimum cost among its relevant errors, was found to be a significant bottleneck for certain code families.

My investigation revealed that for any given detector, only a small percentage of all potentially relevant errors actually contribute to its minimum cost during a decoding run. This "sparsity" of contributing errors was observed across all tested code families:

| Code Family | Minimum % | Maximum % | Average % |
| --- | --- | --- | --- |
| Color Codes | 4.90% | 28.66% | 10.57% |
| Bivariate-Bicycle Codes | 3.79% | 42.76% | 10.13% |
| NLR5 Bivariate-Bicycle Codes | 3.47% | 34.97% | 9.80% |
| NLR10 Bivariate-Bicycle Codes | 2.94% | 20.81% | 6.71% |
| Surface Codes | 8.70% | 36.09% | 13.00% |
| Transversal CNOT Protocols | 4.07% | 22.27% | 11.17% |

This consistent sparsity presents a significant optimization opportunity, especially for configurations with higher error rates and larger code distances where get_detcost consumes a larger portion of the total decoding time. The core question this work seeks to answer is: what is the performance ceiling for the decoder if we could effectively predict and prune non-contributing errors?
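
As a rough illustration of how these contribution percentages could be gathered, here is a minimal C++ sketch that instruments a plain minimum-cost scan to record which error index achieves the minimum on each call. The structures and names (DetectorStats, min_cost_with_stats, error_cost) are hypothetical simplifications, not the decoder's actual internals.

```cpp
#include <cstddef>
#include <limits>
#include <unordered_set>
#include <vector>

// Hypothetical instrumentation: track which errors ever achieve a detector's
// minimum cost, so the fraction of "contributing" errors can be reported later.
struct DetectorStats {
  std::unordered_set<std::size_t> contributing;  // error indices that were ever the argmin
};

// Same result as a plain minimum scan over the detector's candidate errors,
// plus bookkeeping of the argmin. Assumes candidate_errors is non-empty.
double min_cost_with_stats(const std::vector<std::size_t>& candidate_errors,
                           const std::vector<double>& error_cost,
                           DetectorStats& stats) {
  double best = std::numeric_limits<double>::infinity();
  std::size_t best_error = candidate_errors.front();
  for (std::size_t e : candidate_errors) {
    if (error_cost[e] < best) {
      best = error_cost[e];
      best_error = e;
    }
  }
  stats.contributing.insert(best_error);
  return best;
}

// After a decoding run: percentage of the detector's candidates that ever contributed.
double contributing_percentage(const DetectorStats& stats, std::size_t num_candidates) {
  return 100.0 * static_cast<double>(stats.contributing.size()) /
         static_cast<double>(num_candidates);
}
```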


Proposed Solution: The Cache-and-Trim-Errors Heuristic

To explore this question, I developed a new heuristic that modifies the get_detcost function. This heuristic is implemented as an optional, user-configurable strategy with two key steps:

  1. Caching: During the initial phase of a decoding run, for each detector, the decoder tracks and caches the specific errors that contribute to its minimum cost as computed by get_detcost.
  2. Trimming: Once the number of cached errors for a detector reaches a pre-defined threshold, all other errors are permanently trimmed from consideration for that detector. Subsequent calls to get_detcost for that detector will only operate on this smaller, cached set of errors.

This approach serves as a simple, yet effective, experiment to measure the trade-off between speed and accuracy when a significant portion of the search space is aggressively pruned. This is a heuristic, and as such, it directly affects the decoder's accuracy by potentially removing paths to the globally optimal solution.
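
The sketch below illustrates the two steps under simplifying assumptions: a flat per-error cost array, a per-detector list of candidate error indices, and a threshold expressed as a fraction of that list. Names such as DetectorErrorCache and get_detcost_cached are illustrative and do not mirror the actual Tesseract implementation.

```cpp
#include <cstddef>
#include <limits>
#include <unordered_set>
#include <vector>

// Hypothetical per-detector state for the cache-and-trim strategy.
struct DetectorErrorCache {
  std::unordered_set<std::size_t> cached;  // errors observed to contribute to the minimum
  bool trimmed = false;                    // once set, only the cached errors are scanned
};

double get_detcost_cached(std::size_t detector,
                          const std::vector<std::vector<std::size_t>>& errors_for_detector,
                          const std::vector<double>& error_cost,
                          double threshold_fraction,  // e.g. 0.10 or 0.20
                          std::vector<DetectorErrorCache>& caches) {
  DetectorErrorCache& cache = caches[detector];
  const std::vector<std::size_t>& all_errors = errors_for_detector[detector];

  double best = std::numeric_limits<double>::infinity();

  // Trimming step: after the threshold is reached, restrict the scan to the cached set.
  if (cache.trimmed) {
    for (std::size_t e : cache.cached) {
      if (error_cost[e] < best) best = error_cost[e];
    }
    return best;
  }

  // Caching step: scan all candidates and remember which one achieved the minimum.
  std::size_t best_error = all_errors.front();  // assumes a non-empty candidate list
  for (std::size_t e : all_errors) {
    if (error_cost[e] < best) {
      best = error_cost[e];
      best_error = e;
    }
  }
  cache.cached.insert(best_error);

  // Once enough distinct contributing errors have been cached, trim permanently.
  if (static_cast<double>(cache.cached.size()) >=
      threshold_fraction * static_cast<double>(all_errors.size())) {
    cache.trimmed = true;
  }
  return best;
}
```

Because trimming is permanent for the remainder of the run, an error that would only become relevant later is never reconsidered for that detector; this is the source of the accuracy risk discussed in the Caveats section.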

A more sophisticated approach to this problem, beyond the scope of this pull request, would be to gather extensive performance data from decoding runs and use it to train an AI/ML model. Such a model could potentially learn to predict the most likely contributing errors for a detector, eliminating the need to explore all of them and providing a more robust, non-heuristic solution.


User Flags

This heuristic can be enabled and configured using the following flags:

  • --cache-and-trim-detcost: Enables the heuristic, replacing the standard get_detcost function with the optimized version.
  • --detcost-cache-threshold=<percentage>: Specifies the percentage threshold (e.g., 10%, 20%) at which error trimming occurs for a detector.
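
One plausible reading of the threshold flag, assuming the percentage is taken relative to the number of errors relevant to each detector (the exact semantics live in the implementation and are not reproduced here), is a per-detector cutoff count:

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical mapping from --detcost-cache-threshold=<percentage> to the number
// of distinct cached errors that triggers trimming for one detector.
std::size_t trim_cutoff(double threshold_percentage, std::size_t num_errors_for_detector) {
  return static_cast<std::size_t>(std::ceil(
      threshold_percentage / 100.0 * static_cast<double>(num_errors_for_detector)));
}
```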

Caveats

It is crucial to understand that this heuristic is designed for specific performance-critical scenarios. It should not be used as a general-purpose optimization. Applying this heuristic to a code family or configuration that does not spend significant time in get_detcost will likely degrade performance or significantly impact accuracy.

This is because the heuristic introduces an initial overhead for tracking and caching errors. This overhead can only be justified if the subsequent performance gains from trimming errors are substantial enough to compensate.

This heuristic is not effective for and should be avoided on:

  • Code families that already execute fast: For codes like Surface Codes, which are already decoded quickly, the initial caching overhead will likely outweigh any minimal gains, leading to a net degradation in performance.
  • Configurations with low get_detcost time: The heuristic is only beneficial when get_detcost is a major bottleneck. For configurations with lower error rates and smaller code distances, or for code families like Transversal CNOT Protocols where get_detcost is not a dominant factor, the time saved by trimming errors is negligible.
  • Configurations where accuracy is paramount: Because this is a heuristic, it fundamentally alters the A* search path. Trimming errors prematurely can remove the true optimal path, leading to a less accurate result. This risk is particularly high for configurations where the decoder doesn't spend a lot of time in get_detcost, as the trimming may occur before enough relevant errors have been discovered.

The purpose of this heuristic is to serve as a tool for investigation and benchmarking on performance-demanding configurations (like Color Codes and Bivariate-Bicycle Codes), not as a default optimization for the decoder.


Performance and Accuracy Evaluation

This heuristic's impact was evaluated on code families where get_detcost is a primary bottleneck: Color Codes and Bivariate-Bicycle Codes. The data below summarizes the cumulative speedups and accuracy impacts for these code families, building on the performance gains from prior pull requests.

Color Codes

  • p=0.001
    • 20% threshold: Achieved 45.5-60.3% cumulative speedup. Accuracy was largely unaffected.
    • 10% threshold: Achieved 49.5-63.3% cumulative speedup. Accuracy degradation was minimal, with only one configuration showing a notable increase in low-confidence results (r=11: from 45 to 79).
  • p=0.002
    • 20% threshold: Achieved 56.8-60.6% cumulative speedup. A few configurations showed minor accuracy changes (~10 more low-confidence results).
    • 10% threshold: Achieved 62.5-66.5% cumulative speedup. Accuracy degradation was noticeable for the r=11 configuration (low-confidence results rose from 646 to 730).

Bivariate-Bicycle Codes

  • p=0.001
    • 20% threshold: Achieved 46.3-73.6% cumulative speedup. Worst-case accuracy degradation was ~20 more low-confidence results.
    • 10% threshold: Achieved 49.5-82.3% cumulative speedup. Worst-case accuracy degradation reached ~60 more low-confidence results.
  • p=0.002
    • 20% threshold: Achieved 56.9-82.5% cumulative speedup. Accuracy impact was generally lower than at p=0.001.
    • 10% threshold: Achieved 63.6-84.5% cumulative speedup. Accuracy impact was generally lower than at p=0.001.

The graphs below provide a detailed visual breakdown of these results:

[Four graphs: per-configuration speedup and accuracy results for Color Codes and Bivariate-Bicycle Codes]

Key Contributions

  • Identified and Characterized Error Sparsity: Discovered and quantified a consistent sparsity in the errors that affect the get_detcost function across various code families and configurations.
  • Developed a Novel Heuristic: Designed and implemented the cache-and-trim-errors heuristic as an experimental tool to evaluate the potential for significant performance gains by pruning the search space.
  • Comprehensive Benchmarking: Conducted extensive experiments to validate the heuristic's effectiveness and its trade-off between speed and decoding accuracy, especially for high-cost configurations.
  • Demonstrated Significant Speedups: Achieved cumulative performance gains of up to 84.5% for specific code families at higher error rates and larger code distances, while maintaining robust accuracy.
  • Provided a Foundation for Future Work: This heuristic and its benchmarking data provide valuable insights and a starting point for more advanced solutions, such as using AI/ML to predict contributing errors.

draganaurosgrbic and others added 30 commits June 14, 2025 14:52
@draganaurosgrbic changed the title from "Caching and trimming errors from 'get_detcost' function" to "Caching and trimming errors per detector" on Jul 14, 2025
@draganaurosgrbic requested a review from LalehB on July 14, 2025 23:56
@LalehB
Collaborator

LalehB commented Jul 15, 2025

Thank you @draganaurosgrbic for the plots and details.
I think we need to be careful about these benchmarks and how we are drawing conclusions from them, such as:

  • "Bivariate Bicycle Code (NLR5WB, r=10, d=10, p=0.002, 10% threshold): This was an outstanding case, demonstrating over 110% speedup (2.1x faster) with absolutely no change in low confidence results and errors results."

or

  • "Bivariate Bicycle Code (NLR10WB, r=10, d=10, p=0.002, 10% threshold): Delivered a 93% speedup (1.93x faster) with no change in low confidence results/errors results."

However, the plotted accuracy here is based on the sum of low-confidence and error counts, which equals 500 in both examples, while the total number of shots in each benchmark is also 500. This means the decoder failed to decode any of the shots. Because of this, the accuracy is already at its worst and cannot get any worse regardless of which heuristic is used.
So we cannot conclude that the heuristic had no impact on accuracy for these examples, since the baseline accuracy was already maximally bad in these benchmarks.

@draganaurosgrbic
Contributor Author

draganaurosgrbic commented Jul 15, 2025

@LalehB Thank you for your feedback; I will update the conclusions for those two examples. I intentionally left them in the discussion because they exhibited longer decoding times due to not converging within the selected priority-queue limit. I used a script that filters for benchmarks with low accuracy impact, including those where there was no accuracy change at all.

@draganaurosgrbic
Contributor Author

draganaurosgrbic commented Jul 15, 2025

@LalehB I updated the PR description.

@draganaurosgrbic changed the title from "Caching and trimming errors per detector" to "Caching and trimming errors per detector in Color and BB codes" on Jul 17, 2025
@draganaurosgrbic changed the title from "Caching and trimming errors per detector in Color and BB codes" to "Caching and Trimming Errors per Detector in Color and BB Codes" on Jul 20, 2025
@draganaurosgrbic requested a review from noajshu on July 20, 2025 05:09
@draganaurosgrbic
Contributor Author

@LalehB @noajshu The PR description is updated; it now includes comprehensive data for analyzing the impact of the flag on performance and accuracy, a summary list of the major contributions, and more. If you run into any issues, please let me know.

@draganaurosgrbic removed the request for review from noajshu on August 3, 2025 10:54