Skip to content

Optimization: Use dynamic array while evaluating circuit #1344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

JulianGCalderon
Copy link
Contributor

@JulianGCalderon JulianGCalderon commented Aug 1, 2025

Right know, we keep a vector of MLIR values while evaluating a circuit (one value per gate). This implies that we are using a lot of MLIR values at the same time, which puts pressure on the LLVM optimization phases.

This PR changes it to use a dynamic array of gate values, which uses a single register (a pointer to the array).

Minor additional optimization: Instead of converting gate values from integer to struct form in the circuit_eval libfunc, they are lazily converted in the get_output libfunc. This avoid converting gate values that will never be read.

EDIT: It seems that the "minor additional optimization" actually was causing the performance improvement. The main optimization of this PR doesn't achieve the expected results (there is no improvement). An additional PR was created with just the minor optimization: #1348

Benchmark

These benchmarks were executed in a Macbook M4 Pro.

I compiled several circuit-heavy contract classes, and compared the compilation time before and this PR. The contract classes were compiled with optimization level -O2.

Class Hash Before After Improvement
0x1b5fbe10... 42821 36116 1.19x
0x3100defc... 42059 35413 1.19x
0x79666cdb... 40039 33943 1.18x
0xa40ecaa0... 41878 35694 1.17x
0x4ffeed29... 50508 42166 1.20x
0x2269858a... 46784 39439 1.19x
0x5ff378cb... 42489 35645 1.19x
0x4edde37c... 42799 35626 1.20x

I benchmarked transactions involving these contracts, with optimization level -O3. There seems to be no significant performance change.

Tx Hash Before After Improvement
0x009e11b7... 32193015 31906187 1.01x
0x059c3787... 183447974 185201611 0.99x
0x066ea718... 183589631 186228986 0.99x
0x077c686a... 31906201 32040878 1.00x
0x0234af38... 186797651 187637530 1.00x
0x0624b433... 182445656 185180073 0.99x

Correctness

I replayed 25K blocks (1728032 transactions) and encountered no state diffs.

Checklist

  • Linked to Github Issue
  • Unit tests added
  • Integration tests added.
  • This change requires new documentation.
    • Documentation has been added/updated.

@codecov-commenter
Copy link

codecov-commenter commented Aug 1, 2025

Codecov Report

❌ Patch coverage is 97.48428% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.22%. Comparing base (21cc66c) to head (e0831ce).

Files with missing lines Patch % Lines
src/libfuncs/circuit.rs 97.43% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1344      +/-   ##
==========================================
+ Coverage   79.18%   79.22%   +0.03%     
==========================================
  Files         111      111              
  Lines       30441    30496      +55     
==========================================
+ Hits        24106    24159      +53     
- Misses       6335     6337       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@FrancoGiachetta FrancoGiachetta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just small comments.

This would definitely require another PR but since the use of dynamic arrays are increasing, I think it would be nice to wrap its operations inside functions. For example, to update an element, we need to offset the pointer with a gep operation. Then we need to load the value and operate it. And, finally, we need to store it.
Perhaps having utility functions for these arrays would be better.

let circuit_length = block.const_int(
context,
location,
(1 + circuit_info.n_inputs + circuit_info.values.len()) * u384_layout.pad_to_align().size(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to use the repeat_layout function here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is equivalent. If you prefer I can change it to use repeat_layout instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use it to maintain a standard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 4293c49.

let circuit_input_length = block.const_int(
context,
location,
circuit_info.n_inputs * u384_layout.pad_to_align().size(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 4293c49.

DiegoCivi
DiegoCivi previously approved these changes Aug 4, 2025
Copy link
Contributor

@DiegoCivi DiegoCivi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! One minor question.

// Unknown values are represented as None
) -> Result<[&'this Block<'ctx>; 2]> {
// Throughout the evaluation of the circuit we maintain an MLIR dynamic
// array of gate values. It will contain 1 + N_INPUTS + N_GATES elements.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need to have that length? I saw that the first element is just a 1 but as far as I could understand we never use that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't manually use it, but some gate offsets implicitly refer to the first gate (with index 0), and expect it to have a value of 1.

/// This function consumes the `circuit_data` input variable, and frees the underlying memory.
///
/// The original Cairo hint evaluates all gates, even in case of failure.
/// This implementation exits on first error, as there is no need for the partial outputs yet.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why there is no need for the partial outputs yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CircuitPartialOutputs type is a Cairo extern type, which means that its opaque and the user cannot access it fields directly. The only way to manipulate an extern type is through libfuncs.

There is currently no libfunc that receives that type. This means that from Cairo Native's point of view, its a noop type.

gabrielbosio
gabrielbosio previously approved these changes Aug 4, 2025
Copy link
Collaborator

@gabrielbosio gabrielbosio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one question

FrancoGiachetta
FrancoGiachetta previously approved these changes Aug 4, 2025
@gabrielbosio gabrielbosio enabled auto-merge August 5, 2025 16:10
@gabrielbosio
Copy link
Collaborator

Both Hyperfine and Compare Dumps are not running since get triggered on a PR to main only.

@JulianGCalderon JulianGCalderon marked this pull request as draft August 6, 2025 15:56
auto-merge was automatically disabled August 6, 2025 15:56

Pull request was converted to draft

@JulianGCalderon JulianGCalderon changed the title [Next] Use dynamic array while evaluating circuit Use dynamic array while evaluating circuit Aug 6, 2025
@JulianGCalderon JulianGCalderon changed the title Use dynamic array while evaluating circuit Optimization: Use dynamic array while evaluating circuit Aug 6, 2025
@JulianGCalderon JulianGCalderon changed the base branch from next to v0.6.x August 6, 2025 18:05
@JulianGCalderon JulianGCalderon changed the base branch from v0.6.x to main August 6, 2025 18:06
@JulianGCalderon JulianGCalderon dismissed stale reviews from gabrielbosio, DiegoCivi, and FrancoGiachetta August 6, 2025 18:06

The base branch was changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants