Skip to content

repr(ordered_fields) #3845

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

RustyYato
Copy link

@RustyYato RustyYato commented Aug 5, 2025

Add repr(ordered_fields) and provide a migration path to switch users from repr(C) to repr(ordered_fields), then change the meaning of repr(C) in the next edition.

This RFC is meant to be an MVP, and any extensions (for example, adding more reprs) are not in scope. This is done to make it as easy as possible to accept this RFC and make progress on the issue of repr(C) serving two opposing roles.

Rendered

To avoid endless bikeshedding, I'll make a poll if this RFC is accepted with all the potential names for the new repr. If you have a new name, I'll add it to the list of names in the unresolved questions section, and will include it in the poll.

@clarfonthey
Copy link

clarfonthey commented Aug 5, 2025

Not to add too many extra colours to the list, but repr(consistent) feels like a good name for this, since the purpose is to provide a consistent layout that does not depend on generics, compiler version, or target. The important thing is just that it's consistent, not that it matches what C does.

(Note: those three things should cover every case I've seen that uses repr(C) that should use repr(ordered_fields), but please feel free to correct me if I missed anything.)

Whereas repr(C) is explicitly, match what C does.

Also, while it may be more technical than most users need to understand, it would be helpful if the RFC reiterated the current issues with repr(C) that we want to fix, and potential future differences between repr(C) and repr(ordered_fields) that could pop up. I've read some of them but am not 100% sure of the details, and it would be nice to keep as part of the RFC.

@Lokathor
Copy link
Contributor

Lokathor commented Aug 5, 2025

Just as a small point of style the Guide Level Explanation is usually "what would be written in the rust tutorial book", and the Reference Level Explanation is "what could be written into the Rust Reference". This isn't a strict requirement, but personally I'd like to see the Reference Level part written out. Using the present tense, as if the RFC was accepted and implemented.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout.
`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in-memory layout.

"cross-platform" -- the layout will differ when there are different layouts for struct members' types, in particular primitive types can have different alignments which changes the amount of padding.

e.g., #[repr(ordered_fields)] struct S(u8, f64); doesn't have the same layout on x86_64 and i686

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, this will need to be documented as a hazard in the ordered_fields docs. However, the repr itself will be cross-platform. For example, #[repr(ordered_fields)] struct Cross([u8; 3], SomeEnum); will be truly cross-platform (given that SomeEnum is!).

RustyYato and others added 3 commits August 5, 2025 16:18
@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Aug 6, 2025
@moonheart08
Copy link

Not to add too many extra colours to the list, but repr(consistent) feels like a good name for this, since the purpose is to provide a consistent layout that does not depend on generics, compiler version, or target. The important thing is just that it's consistent, not that it matches what C does.

(Note: those three things should cover every case I've seen that uses repr(C) that should use repr(ordered_fields), but please feel free to correct me if I missed anything.)

Whereas repr(C) is explicitly, match what C does.

Also, while it may be more technical than most users need to understand, it would be helpful if the RFC reiterated the current issues with repr(C) that we want to fix, and potential future differences between repr(C) and repr(ordered_fields) that could pop up. I've read some of them but am not 100% sure of the details, and it would be nice to keep as part of the RFC.

Just voicing support for repr(consistent) as naming.
Aside from the above, it more clearly hones in on the primary promises of the RFC, which is not just ordering but also exact type representation for things like enums. Field ordering is not the only thing it promises.

@joshtriplett joshtriplett added the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Aug 6, 2025
@joshtriplett
Copy link
Member

Nominating this so that we can do a preliminary vibe-check on it in a lang triage meeting.

Comment on lines +14 to +16
Currently `repr(C)` serves two roles
1. Provide a consistent, cross-platform, predictable layout for a given type
2. Match the target C compiler's struct/union layout algorithm and ABI
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big fan of doing this split, especially for structs. (It's less obvious what choices to make for other things, IMHO, but at least for structs this is something I've wanted for ages, so that for example Layout::extend can talk about it instead of C.)

Pondering the bikeshed: declaration_order or something could also be used to directly say what you're getting.

(This could be contrasted with other potential reprs that I wouldn't expect this RFC to add, but could consider as future work, like a deterministic_by_size_and_alignment where some restricted set of optimizations are allowed but you can be sure that usize and NonNull<String> can be mixed between different types while still getting the "same" field offsets, for example.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is also useful for unions, so we don't need to rely on repr(C) to ensure that all fields of a union are at offset 0.

This could be contrasted with other potential reprs that I wouldn't expect this RFC to add...

This also works as an argument against names like repr(consistent), since there are multiple consistent and useful repr, making it not descriptive enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that declaration_order or ordered_fields is a bit weird on a union, because of course they're not really in any "order".

It makes me ponder whether we should just have repr(offset_zero) for unions to be explicit about it, or something.

(Which makes me think of other things like addressing rust-lang/unsafe-code-guidelines#494 by having a different constructs for "bag of maybeuninit stuff that overlap" vs "distinct options with an active-variant rule for enum-but-without-stored-discriminant". But those are definitely not this RFC.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind spelling it as repr(offset_zero) for unions if that helps get this RFC accepted 😄. However, I have a sneaking suspicion that this isn't the contentious part of this RFC.
I know the name isn't optimal (intentionally). This can be hashed out after the RFC is accepted (or even give a different name for all of struct, union, and enum).
The most important bit for me is just that we do the split (for all of struct, union, and enum, to be consistent).

@RustyYato
Copy link
Author

I've updated how enums's tags are specified, now they just defer to whatever repr(C)'s tag type is. This is done to reduce the friction of switching from repr(C) to repr(ordered_fields). To ensure that all uses of repr(ordered_fields) can be cross-platform, I've adding a lint to ensure that the user also adds an explicit repr for repr(uN)/repr(iN).


`repr(C)` in edition <= 2024 is an alias for `repr(ordered_fields)` and in all other editions, it matches the default C compiler for the given target for structs, unions, and field-less enums. Enums with fields will be laid out as if they are a union of structs with the corresponding fields.

Using `repr(C)` in editions <= 2024 triggers a lint to use `repr(ordered_fields)` as a future compatibility lint with a machine-applicable fix. If you are using `repr(C)` for FFI, then you may silence this lint. If you are using `repr(C)` for anything else, please switch over to `repr(ordered_fields)` so updating to future editions doesn't change the meaning of your code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too noisy. Most code out there using repr(C) is probably fine - IIUC, if you're not targeting Windows or AIX, maybe definitely fine? - and having a bunch of allow(...) across a bunch of projects seems unfortunate.

Maybe we can either (a) only enable the lint for migration, i.e., the next edition's cargo fix would add allows for you or (b) we find some new name... C2 for the existing repr(C) usage to avoid allows. But (b) also seems too noisy to me.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it could just be an optional edition compatibility lint, so if someone enables e.g. rust_20xx_compatibility it shows up but otherwise not.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(a) only enable the lint for migration

That was the intention, hence the name edition_2024_repr_c. I'll make this more clear, that this is intended to be a migration lint.

Rustfix would update to #[repr(ordered_fields)] to preserve the current behavior. For the FFI crates, #![allow(edition_2024_repr_c)] at the top of lib.rs would suffice. If you have a mix of FFI and non-FFI uses of repr(C), then you'll have to do the work to figure out which is which, no matter what option is chosen to update repr(C) - even adding repr(C2), since then the FFI use case would need to update all their reprs to repr(C2).

Overall, I think this scheme only significantly burdens those who have a mix of FFI and non-FFI uses of repr(C). But they were going to be burdened no matter what option was chosen.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the new wording/lints too noisy still?

* Any crate that does not have FFI can just apply the autofix
* Any crate which uses `repr(C)` for FFI can ignore the warning crate-wide
* Any crate that mixes both must do extra work to figure out which is which. (This is likely a tiny minority of crates)
* Once the next edition rolls around (2027?), `repr(C)` on the new edition will *not* warn. Instead, the meaning will have changed to mean *only* compatibility with C. The docs should be adjusted to mention this edition wrinkle.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe there should be a warning for crates that use repr(C) in >edition 2024 but don't have any extern "<non-Rust>" blocks or functions, since that's a good indicator they're probably not using it for FFI and haven't heard about repr(ordered_fields)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think any extern block/function should disable the lint, even extern "Rust" to allow Rust-Rust FFI (with some stable ABI, i.e. a future crABI or the abi_stable crate).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're doing Rust-Rust FFI and expect any kind of ABI stability, you shouldn't be using extern "Rust" (since that's explicitly not stable), you should be using extern "C" (used by abi_stable afaict) or extern "crabi", both of which would disable the lint according to my proposal.

Copy link
Contributor

@Jules-Bertholet Jules-Bertholet Aug 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two repr(Rust) structs with the same field names, types, and order are not guaranteed to have the same layout and ABI. But two extern "Rust" functions with the same signature, on the same platform and compiler version are guaranteed ABI-compatible. So there is a use-case for combining repr(C) with extern "Rust".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the new lint to all editions, and it should take precedence over the migration lint. This way, we can migrate as many people as possible to repr(ordered_fields). Making it much easier to read code in the future, at the cost of some churn now.

@traviscross traviscross added the P-lang-drag-2 Lang team prioritization drag level 2. label Aug 12, 2025
}
```

Enums with fields will be laid out as if they were a union of structs.
Copy link
Contributor

@kpreid kpreid Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This statement matches the algorithm currently used for "primitive representations" (repr(u8) etc), not the algorithm currently used for repr(C) or repr(C, uN). If this change is made, then anyone in need of specifically the current repr(C) enum representation cannot migrate to repr(ordered_fields).

The difference between the two algorithms (besides the integer type used for the tag) is that in repr(C), all variants have their first field at the same offset, whereas in repr(uN), each variant’s first field has only the offset required by its type’s alignment, so individual variants may be at a lower offset. The repr(C) enum acts as (and is documented as) a “tagged union” that contains an “untagged union”, whereas repr(uN) does not.

Example of the difference:

#![feature(offset_of_enum)]

#[repr(align(8))]
struct A64(u64);

#[repr(C, i32)]
enum ExampleC {
    Foo(u8),
    Bar(A64),
}
#[repr(i32)]
enum ExamplePrim {
    Foo(u8),
    Bar(A64),
}

fn main() {
    dbg!(
        std::mem::offset_of!(ExampleC, Foo.0), // prints 8
        std::mem::offset_of!(ExamplePrim, Foo.0), // prints 4
    );
}

I think that either repr(ordered_fields, uN) should instead exactly mimic repr(C, uN) in this regard, or — perhaps a better idea — there should be a separate explicitly named representation for this, since it is rare to actually need this layout, and so users merely seeking predictable layout are better served by repr(uN), but they might unthinkingly put repr(ordered_fields) on their enums along with their structs..

Copy link
Author

@RustyYato RustyYato Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I didn't realize there was that difference, thanks for bringing this up. I think repr(ordered_fields) should match what repr(C) does now, to ease migrations (even in combo with the primitive reprs). We could add lints to suggest changing to just a primitive repr if repr(ordered_fields) is used on an enum.

I think making the migration as painless as possible will make it much easier to split up repr(C) up. And we can also add guidance towards other reprs via lints.

I'm open to having the repr be named differently for the different kinds of type. (For example, repr(offset_zero) for unions was considered in another thread)


Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for case 1. They want it to be size 0 (as it currently is).

A tertiary motivation is to make progress on a workaround for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480). This proposal doesn't attempt a complete solution for the bug, but it will be a necessary component of any solution to the bug.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the third motivation, then what is the secondary motivation?^^

Copy link
Author

@RustyYato RustyYato Aug 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it wasn't super clear,

The motivations are

  1. Splitting up two contradictory use-cases for repr(C)
  2. Correct handling of ZSTs on MSVC repr(C) on MSVC targets does not always match MSVC type layout when ZST are involved rust#81996
  3. The MSVC alignment bug MSVC on x86-32 Windows fails to align variables to their required alignment rust#112480

But also, I was trying to hint that this isn't the most important motivation. I'll reword this to be less confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how 1 is a motivation in itself -- 2 and 3 are the motivation to do 1. If it wasn't for 2 and 3, why would we care about 1? I don't think the idealistic "these two cases should not be conflated" really stands as a motivation (for a breaking change!) if there's no downside to conflating them.

Copy link
Author

@RustyYato RustyYato Aug 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've updated the motivation section to remove references to numbers, so the motivation section should flow better.

I don't think the idealistic "these two cases should not be conflated" really stands as a motivation (for a breaking change!) if there's no downside to conflating them.

Yes, I agree that this doesn't stand on it's own. However, it doesn't need to, since it is one (small) motivation among others that make the case for this change. A small motivation to help add context to other more important motivations.

In this case, there is a downside to conflating them (as seen in the other linked issues). I guess I could argue that this is really the only motivation, with the others as evidence that there are downsides to this conflation. But that feels too much like arguing about semantics, which I would like to avoid.

RustyYato and others added 3 commits August 18, 2025 10:19
Co-authored-by: +merlan #flirora <flirora@flirora.xyz>
* update repr(ordered_fields) algorithm for `enum`
* add to and update motivation section
* add potential drawbacks to some lints

By splitting `repr(ordered_fields)` off of `repr(C)`, we can allow `repr(C, packed(N))` to contain over-aligned fields (while making the struct less packed), and (continuing to) disallow `repr(ordered_fields, packed(N))` from containing aligned fields. Thus keeping the Rust-only case free of warts, without compromising on FFI use-cases.

The final motivation is to make progress on a workaround for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably also mention the AIX double issue, which is its own special snowflake. AIUI, fixing it would require reducing align_of::<f64>() from 8 to 4 while also changing the C-compatible layout algorithm, but unlike the MSVC i32 issue it could probably be done at the same time as this RFC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is in the same boat as the MSVC issue, in that reducing the alignment of f64 is a breaking change and would subtly break the layouts of existing types. So would need to be done with some care, and cannot be done with just this RFC.

(and MSVC issue was about u64/i64 iirc)

Co-authored-by: Jacob Lifshay <programmerjake@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. P-lang-drag-2 Lang team prioritization drag level 2. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.