Skip to content

Hash to curve as defined in the standard #377

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Jul 5, 2025

Conversation

iquerejeta
Copy link
Contributor

The current implementation is not compatible with the current definition of the standard. This PR provides a hash-to-curve implementation as defined in draft-irtf-cfrg-hash-to-curve-12.

  • Implementation of hash_to_field as defined in the standard
  • Implementation of hash_to_curve as defined in the standard, by changing the mechanism over which we chose the sign.
  • For the point above, had to change the elligator_encode to return whether eps is a square or not (required for hash_to_curve).
  • Included test vectors of the draft.
  • Included FieldElement::from_bytes_wide(bytes: &u8; 64]) to reduce integers encoded in 64 bytes.

@iquerejeta
Copy link
Contributor Author

iquerejeta commented Dec 7, 2022

I see in #438 that the function hash_to_bytes was deprecated. Is there any chance to include the hash_to_curve as defined in the standard? If there is the possibility of merging this PR, I'll check that the implemented version corresponds with the current version of the informational draft.
cc: @rozbb

@rozbb
Copy link
Contributor

rozbb commented Dec 7, 2022

Yes, thank u for following up. I do want to merge this but I haven't gotten a chance to do a proper review. We really need to get to a 4.0 release, so I think this might have to wait until 4.1. Thank you so much for this contribution. I promise it will get the attention it deserves.

@iquerejeta
Copy link
Contributor Author

Great, thanks! And whenever you think you have time, you can ping me here, and I'll rebase (as there's been some changes in the hashing functions) and make sure it follows the latest version of the standard 👍 Thanks for working on this 🙏

@tarcieri
Copy link
Contributor

tarcieri commented Dec 8, 2022

It would also be interesting if someone could attempt to impl the hash2curve traits from the elliptic-curve crate i.e. as an optional dependency

@iquerejeta
Copy link
Contributor Author

Happy to do it 👍 given that there is no rush, I should be able to dedicate a few cycles in a reasonable amount of time.

@paweljakubas
Copy link

hi @rozbb @tarcieri Any chance we can proceed with this one? Or you could identify what should be done to make that happen? I might help if you identify things that would make this PR land into master :-) Cheers! Pawel

@tarcieri
Copy link
Contributor

Well for starters, it's full of merge conflicts

@iquerejeta
Copy link
Contributor Author

Rebase done 👍 Happy to make any further changes if necessary.

@paweljakubas
Copy link

paweljakubas commented Apr 28, 2025

Thank you very much @iquerejeta (I have also started rebasing your PR on Fri but you were quicker 😄 )
@tarcieri are we fine now? Thanks in advance for your help and time spent here and cheers 👋

@paweljakubas
Copy link

paweljakubas commented Apr 30, 2025

@tarcieri @rozbb @oleganza can we do with @iquerejeta something more ?
Thanks in advance for your time and please do not be angry with me for my pinging 🙏

We are working on VRF library based on dalek-cryptography and basing atm on this branch and we would like to have just master as dependency rather than branch. And also humbly thinking this is a good addition 😄

Cheers!

@abailly
Copy link

abailly commented May 8, 2025

Apologies for pestering you @tarcieri but we (Amaru team) ultimately depend on this PR to be included in a proper version of curve255159-dalek in order to simplify our dependencies on VRF library and also to be able to request a proper audit on a stable code base.
Is there anything we can help on to move this forward and cut a release in the near future?

@tarcieri
Copy link
Contributor

tarcieri commented May 8, 2025

@abailly you should probably ask @rozbb. I'm working on a lot of prerequisites to the next releases and don't have time to review this, sorry

@abailly
Copy link

abailly commented May 9, 2025

Thanks a lot @tarcieri, waiting for @rozbb's feedback then 🙏

@tarcieri
Copy link
Contributor

tarcieri commented May 9, 2025

I'll also note this looks fine to me at first glance, but I'd prefer one of us go through it in detail

Copy link
Contributor

@rozbb rozbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi hi! Thank you so much for this. This overlapped with #612, which seems like a superset of this, and I was hoping it'd go faster. But it's not and that's fine!

I left a few comments here and there, but overall looks great!

.expect("Montgomery conversion to Edwards point in Elligator failed");

// Now we recover v, to ensure that we got the sign right.
let mont_v =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe silly question: where do these steps come from precisely? I'm following appendix F.3 and it looks like encode_elligator2 is doing up to step 11. So this is doing step 19 by recomputing some values? Why precisely is mont_v set this way if it's supposed to be sqrt(y2) per step 17?

Sorry, I'm not the most familiar with these algorithms because I didn't write the original version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few ways this is done. In libsodium, for example, they compute v via a square root, pick the sign accordingly, and then convert to Edwards using that sign (see here).

What we do is different: note that is_square tells us the sign of the Montgomery v coordinate, not the Edwards X. However, the to_edwards function only lets us specify the sign of X, so we:

  • Call to_edwards with an arbitrary sign (in our case, 0),
  • Recompute v from the u and the Edwards X using the mapping formula (recall that X = ED25519_SQRTAM2 * u / v)
  • Then flip the sign of X if the resulting sign of v doesn't match is_sq ^ v.is_negative().

This avoids using square roots and works with our existing conversion interface (the current interface would not allow us to do it the square root way, because there is no Montgomery -> Edwards conversion that takes both (u, v) as input). We could add code that enables it—happy to benchmark it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my brain is a bit fried but I'll try to understand better. To be clear, my absolute top concern is maintainability. It would be way easier for me if all the formulas in this PR used notation directly from the spec. Do you think that'd be possible? I don't mind some added code (eg the conversion method you mentioned) to get there. I'll even accept a performance hit if necessary. I think @tarcieri would agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be happy to do that. I'm wondering if it makes sense to change the elligator_encode function directly, and perform elligator as specified in the standard. What I have in mind is that we implement the straight line implementation they present in the standard (here. And then we have two functions, an elligator2 for montgomery, and an elligator2 for edwards. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds wonderful, thank you!!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started trying to do this, and there's some functionality missing. The only pow implementation is pow2k. Following the inline implementation we need to compute

y11 = tv2^c4

where

c4 = (q - 5) / 8

with c4 not being a power of 2 (see step 16 here). Don't know if I'm missing a trick here. I think at this point it would be easier to properly document, rather than implementing new functions. If you are happy with that path I can add proper documentation to the function, and leave it as is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. Yeah I think it'd be fine to document the parts that deviate. In my mind, the bar is to have it "so someone with a a highschool math background can see that these routines are clearly the same". It's kinda vague, but I'm sure we can converge on something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment. I decided to link this section, instead of the inline section, as I believe its easier to follow along. Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can follow that, ty

Copy link
Contributor Author

@iquerejeta iquerejeta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to refresh my memory with the first question!

@abailly
Copy link

abailly commented May 23, 2025

@rozbb Would it be possible to get approval to run workflows so that we can check CI is passing?

{
let len_in_bytes = 48;
let l_i_b_str = [0u8, len_in_bytes];
let z_pad = [0u8; 128];
Copy link
Contributor

@daxpedda daxpedda May 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec says:

Z_pad = I2OSP(0, s_in_bytes)

Where s_in_bytes is:

- s_in_bytes, the input block size of H, measured in bytes (see
  discussion above). For example, for SHA-256, s_in_bytes = 64.

128 is correct for SHA-512, but it isn't for any arbitrary hash. E.g. plugging in SHA3-512 would work but produce non-spec compliant results.

I've no idea how to solve this here without generic-array or hybrid-array:

Suggested change
let z_pad = [0u8; 128];
let z_pad = [0u8; { D::BlockSize::USIZE }];

This doesn't work AFAIK.

Alternatively D could be restricted to BlockSizeUser<BlockSize = U128>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone the BlockSizeUser way. Let me know if that works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have used a Vec, which works but now the crate feature digest requires alloc.

I've just realized that digest already depends on generic-array and it also re-exports it. So you can actually use generic-array here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we do handle BlockSizeUser correctly, we also need to enforce further requirements from the spec:

For correctness, H requires b <= s.

This can be done via:

BlockSizeUser<BlockSize: Greater<D::OutputSize, Output = True>>

@tarcieri
Copy link
Contributor

@iquerejeta there are some deprecations you'll need to fix

@iquerejeta
Copy link
Contributor Author

My bad 🙏 forgot about those in the rebase.

@rozbb
Copy link
Contributor

rozbb commented Jun 26, 2025

Thank you, and sorry for the delay. Will make time for this tomorrow


#[test]
#[cfg(feature = "digest")]
fn from_hash_wide() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last question I promise: Where are these test vectors from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit that I do not remember. It's been a while. I was looking for those test vectors in the libsodium implementation or some drafts (which were the most likely sources) but couldn't find any.

I'm happy to regenerate them from any source you'd like and document it accordingly.

@rozbb
Copy link
Contributor

rozbb commented Jun 27, 2025

Ok, just cleaned up docs and clarified the origin of some KATs. I left just 1 question, namely where the from_bytes_wide test vectors came from. Other than that, I think we're good to go!

@rozbb
Copy link
Contributor

rozbb commented Jun 28, 2025

This is confusing to me. It seems the fiat backend doesn't have a FieldElement::from_bytes_wide function, which is necessary for hash_to_field. For some reason, this compiled in the past. Why?

@iquerejeta
Copy link
Contributor Author

What should we do about the fiat backend? The logic of the from_bytes_wide is not from fiat crypto 🤔

@rozbb
Copy link
Contributor

rozbb commented Jun 30, 2025

@iquerejeta Well my question is: what was it doing when the build was succeeding? Shouldn’t the fiat build have always been failing?

@iquerejeta
Copy link
Contributor Author

Yeah, that's a good question. I have no idea why. I believe we never had an implementation of from_bytes_wide for fiat.

@tarcieri
Copy link
Contributor

tarcieri commented Jul 4, 2025

Regarding fiat-crypto and wide reductions, this might be helpful: https://words.filippo.io/wide-reduction/

@rozbb
Copy link
Contributor

rozbb commented Jul 5, 2025

Thank you @tarcieri, that gave me the idea to make from_bytes_wide generic over any field element, rather than over specific implementations. This simplifies the implementation, makes it work for fiat, and also does not detectibly slow down hash_to_curve on my machine.

Do you wanna do one last look-over? I had to bump the fiat-crypto dependency because we needed fiat_25519_from_bytes to be a const fn, which it previously wasn't. I assume that's fine, since we don't export anything from there.

Copy link
Contributor

@tarcieri tarcieri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rozbb rozbb merged commit 25a9dbb into dalek-cryptography:main Jul 5, 2025
23 checks passed
@rozbb
Copy link
Contributor

rozbb commented Jul 5, 2025

Thank you so much @iquerejeta @tarcieri and @daxpedda !!

@abailly
Copy link

abailly commented Jul 5, 2025

🎉 Thank you everyone! Looking forward to next release so that we can release our stuff too :)

@iquerejeta
Copy link
Contributor Author

🎉 Thanks everyone! 🙏

rozbb pushed a commit that referenced this pull request Jul 7, 2025
* Implementation of `hash_to_field` as defined in the standard
* Implementation of `hash_to_curve` as defined in the standard, by changing the mechanism over which we chose the sign.
* For the point above, had to change the `elligator_encode` to return whether `eps` is a square or not (required for `hash_to_curve`).
* Included test vectors of the draft.
* Included `FieldElement::from_bytes_wide(bytes: &u8; 64])` to reduce integers encoded in 64 bytes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants