Caching and Trimming Errors per Detector in Color and BB Codes #46
Conversation
Thank you @draganaurosgrbic for the plots and details.
However, the plotted accuracy here is based on
@LalehB Thank you for your feedback; I will update the conclusion for those two. I intentionally left those two examples in the discussion, as they exhibited longer decoding times due to not converging within the selected priority-queue limit. I used a script that filters benchmarks with low accuracy impact, including those where there was no accuracy change at all.
@LalehB I updated the PR description.
Summary
This pull request introduces a new, optional, and experimental heuristic for the Tesseract decoder. The heuristic, named cache-and-trim-errors, is designed to explore the potential for significant decoding speedups by leveraging a newly discovered sparsity in the `get_detcost` function. The primary goal of this work is not to provide a universally applicable optimization, but to serve as a proof of concept for how much performance could be gained if we could intelligently identify the most relevant errors for a given detector instead of exploring all of them.
Motivation
During extensive performance benchmarking of the Tesseract decoder, I observed a consistent and interesting behavior in the `get_detcost` admissible heuristic. This function, which calculates the cost of a detector by aggregating the minimum cost of all relevant errors, was found to be a significant bottleneck for certain code families.
My investigation revealed that for any given detector, only a small percentage of all potentially relevant errors actually contribute to its minimum cost during a decoding run. This "sparsity" of contributing errors was observed across all tested code families.
This consistent sparsity presents a significant optimization opportunity, especially for configurations with higher error rates and larger code distances, where `get_detcost` consumes a larger portion of the total decoding time. The core question this work seeks to answer is: what is the performance ceiling for the decoder if we could effectively predict and prune non-contributing errors?
Proposed Solution: The Cache-and-Trim-Errors Heuristic
To explore this question, I developed a new heuristic that modifies the `get_detcost` function. This heuristic is implemented as an optional, user-configurable strategy with two key steps:

1. Cache: during decoding, track which errors actually contribute to a detector's minimum cost in `get_detcost`.
2. Trim: once a configurable threshold is reached, subsequent calls to `get_detcost` for that detector will only operate on this smaller, cached set of errors.

This approach serves as a simple yet effective experiment to measure the trade-off between speed and accuracy when a significant portion of the search space is aggressively pruned. This is a heuristic, and as such, it directly affects the decoder's accuracy by potentially removing paths to the globally optimal solution.
A more sophisticated approach to this problem, beyond the scope of this pull request, would be to gather extensive performance data from decoding runs and use it to train an AI/ML model. Such a model could potentially learn to predict the most likely contributing errors for a detector, eliminating the need to explore all of them and providing a more robust, non-heuristic solution.
User Flags
This heuristic can be enabled and configured using the following flags:

- `--cache-and-trim-detcost`: Enables the heuristic, replacing the standard `get_detcost` function with the optimized version.
- `--detcost-cache-threshold=<percentage>`: Specifies the percentage threshold (e.g., 10%, 20%) at which error trimming occurs for a detector.

Caveats
It is crucial to understand that this heuristic is designed for specific performance-critical scenarios and should not be used as a general-purpose optimization. Applying it to a code family or configuration that does not spend significant time in `get_detcost` will likely degrade performance or significantly impact accuracy.
This is because the heuristic introduces an initial overhead for tracking and caching errors. This overhead can only be justified if the subsequent performance gains from trimming errors are substantial enough to compensate.
This heuristic is not effective for, and should be avoided on:

- Configurations with low `get_detcost` time: The heuristic is only beneficial when `get_detcost` is a major bottleneck. For configurations with lower error rates and smaller code distances, or for code families like Transversal CNOT Protocols where `get_detcost` is not a dominant factor, the time saved by trimming errors is negligible.
- Configurations with few calls to `get_detcost`, as the trimming may occur before enough relevant errors have been discovered.

The purpose of this heuristic is to serve as a tool for investigation and benchmarking on performance-demanding configurations (like Color Codes and Bivariate-Bicycle Codes), not as a default optimization for the decoder.
Performance and Accuracy Evaluation
This heuristic's impact was evaluated on code families where `get_detcost` is a primary bottleneck: Color Codes and Bivariate-Bicycle Codes. The data below summarizes the cumulative speedups and accuracy impacts for these code families, building on the performance gains from prior pull requests.
Color Codes
Bivariate-Bicycle Codes
The graphs below provide a detailed visual breakdown of these results:
Key Contributions
- Identified and quantified the sparsity of contributing errors in the `get_detcost` function across various code families and configurations.