From e0cdce8267e80b10dfd1dd2a7e3d81c026296f14 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Fri, 1 Aug 2025 11:30:17 -0700 Subject: [PATCH 01/28] Create 0000-repr-ordered-fields.md initial RFC --- text/0000-repr-ordered-fields.md | 162 +++++++++++++++++++++++++++++++ 1 file changed, 162 insertions(+) create mode 100644 text/0000-repr-ordered-fields.md diff --git a/text/0000-repr-ordered-fields.md b/text/0000-repr-ordered-fields.md new file mode 100644 index 00000000000..98ed6303e88 --- /dev/null +++ b/text/0000-repr-ordered-fields.md @@ -0,0 +1,162 @@ +- Feature Name: (fill me in with a unique ident, `repr_ordered_fields`) +- Start Date: (fill me in with today's date, 2025-08-01) +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bikeshedded if this RFC get's accepted) which can be applied to `struct`, `enum`, and `union` types which guarantees a simple, and predictable layout. Then provide an initial migration plan to switch users from `repr(C)` to `repr(ordered_fields)`. + +# Motivation +[motivation]: #motivation + +Currently `repr(C)` serves two roles +1. Provide a consistent, cross-platform, predictable layout for a given type +2. Match the target C compiler's struct/union layout algorithm and ABI + +But in some cases, these two cases are in tension due to platform weirdness (even on major platforms like Windows MSVC) +* https://github.com/rust-lang/unsafe-code-guidelines/issues/521 +* https://github.com/rust-lang/rust/issues/81996 + +Providing any fix for case 2 would subtly break any users of case 1, which makes this difficult to fix within a single edition. +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +Introduce a new `repr(ordered_fields)` which can be applied to `struct`, `enum`, and `union`. 
On all editions, `repr(ordered_fields)` would behave the same as `repr(C)` on edition 2024. (see reference level explanation for details). + +On editions 2024 (maybe <= 2024), any use of `repr(C)` will trigger a new warning, `edition_2024_repr_c` which will be warn by default. +This warning will suggest a machine applicable fix to switch `repr(C)` to `repr(ordered_fields)`, which is a no-op in the current edition, but helps prepare for changes to `repr(C)` early. This gives time for the community to switch over if they need to. + +For the FFI crates, they can safely ignore the warning by applying `#![allow(edition_2024_repr_c)]` to their crate root. +For crates without any FFI, they can simply run the machine applicable fix. +For crates with a mix, they will need to do some work to figure out which is which. But this is unavoidable to solve the stated motivation. + +For example, the warning could look like this: +``` +warning: use of `repr(C)` in type `Foo` + --> src/main.rs:14:10 + | +14 | #[repr(C)] + | ^^^^^^^ help: consider switching to `repr(ordered_fields)` + | struct Foo { + | + = note: `#[warn(edition_2024_repr_c)]` on by default + = note: `repr(C)` is planned to change meaning in the next edition to match the target platform's layout algorithm. This may change the layout of this type on certain platforms. To keep the current layout, switch to `repr(ordred_fields)` +``` + + +On editions > 2024, `repr(ordered_fields)` may differ from `repr(C)`, so that `repr(C)` can match the platform's layout algorithm. + +On all editions, once people have made the switch this will make it easier to tell *why* the `repr` was applied to a given struct. If `repr(C)`, it's about FFI and interop, if `repr(ordered_fields)` then it's for a dependable layout. Unlike today, where `repr(C)` fills both roles. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +This feature only touches `repr(C)`, other reprs are left as is. 
It introduces exactly one new repr, `repr(ordered_fields)` to take on one of the roles that `repr(C)` use to take. + +`repr(ordered_fields)` will use the same layout algorithm that `repr(C)` currently uses, details can be found in the [reference](https://doc.rust-lang.org/reference/type-layout.html?highlight=repr#reprc-structs). I will give an informal overview here. + +For structs, `repr(ordered_fields)` lays out each fields in memory according to the declaration order of the fields. + +```rust +#[repr(ordered_fields)] +struct Foo { + a: u32, + b: u8, + c: u32, + d: u16, +} +``` +Would be laid out like so (where `.` are padding bytes) +``` +#####...######.. +a b c d +``` + +For unions, each field is laid out at offset 0, and never have niches. + +For enums, the discriminant size is left unspecified (unless another repr specifies it like `repr(ordered_fields, u8)`), but is guaranteed to be stable across rust versions for a given set of variants and fields in each variant. + +Enums are defined as a union of structs, where each struct corresponds to each variant of the enum, with the discriminant is prepended as the first field. + +For example, `Foo` and `Bar` have the same layout in the example below modulo niches. + +```rust +#[repr(ordered_fields, u32)] +enum Foo { + Variant1, + Variant2(u8, u64), + Variant3 { + name: String, + } +} + +#[repr(ordered_fields)] +union Bar { + variant1: BarVariant1, + variant2: BarVariant2, + variant3: BarVariant3, +} + +#[repr(ordered_fields)] +struct BarVariant1 { + discr: u32, +} + +#[repr(ordered_fields)] +struct BarVariant2(u32, u8, u64); + +#[repr(ordered_fields)] +struct BarVariant3 { + discr: u32, + name: String, +} +``` + +Introduce a new `repr(ordered_fields)` which can be applied to `struct`, `enum`, and `union`. On all editions, `repr(ordered_fields)` would behave the same as `repr(C)` on edition 2024. 
+ +On editions > 2024, `repr(ordered_fields)` may differ from `repr(C)`, so that `repr(C)` can match the platform's layout algorithm. For an extreme example, we could stop compiling `repr(C)` for ZST if the target C compiler doesn't allow ZSTs, or we could bump the size to 1 byte if the target C compiler does that by default (this is just an illustrative example, and not endorsed by RFC). + +As mentioned in the guide-level explanation, on edition 2024 (maybe <= 2024), any use of `repr(C)` would trigger a new warn by default diagnostic, `edition_2024_repr_c`. This warning could be phased out after at least two editions have passed. This gives the community enough time to migrate any code over to `repr(ordered_fields)` before the next edition after 2024, but doesn't burden Rust forever. + +The warning should come with a machine-applicable fix to switch `repr(C)` to `repr(ordered_fields)`, and this fix should be part of `cargo fix`. +# Drawbacks +[drawbacks]: #drawbacks + +* This will cause a large amount of churn in the Rust ecosystem +* If we don't end up actually switching `repr(C)` to mean the system layout/ABI, then we will have two identical reprs, which may cause confusion. +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +* `crabi`: http://github.com/rust-lang/rfcs/pull/3470 + * Currently stuck in limbo since it has a much larger scope. 
doesn't actually serve to give a consistent cross-platform layout, since it defers to `repr(C)` (and it must, for it's stated goals) +* https://internals.rust-lang.org/t/consistent-ordering-of-struct-fileds-across-all-layout-compatible-generics/23247 + * This doesn't give a predictable layout that can be used to match the layouts (or prefixes) of different structs +* https://github.com/rust-lang/rfcs/pull/3718 + * This one is currently stuck due to a larger scope than this RFC +* do nothing + * We keep getting bug reports on Windows (and other platforms), where `repr(C)` doesn't actually match the target C compiler. Or we break a bunch of subtle unsafe code to match the target C compiler. +# Prior art +[prior-art]: #prior-art + +This is discussed in Rationale and Alternatives +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +* The migration plan, as a whole needs to be ironed out + * Currently it is just a sketch, but we need timelines, dates, and guarantees to switch `repr(C)` to match the layout algorithm of the target C compiler. + * Before this RFC is accepted, t-compiler will need to commit to fixing the layout algorithm sometime in the next edition. +* The name of the new repr `repr(ordered_fields)` is a mouthful (intentionally for this RFC), maybe we could pick a better name? This could be done after the RFC is accepted. + * `repr(linear)` + * `repr(ordered)` + * `repr(sequential)` + * something else? + +# Future possibilities +[future-possibilities]: #future-possibilities + +* Add more reprs for each target C compiler, for example `repr(C_gcc)` or `repr(C_msvc)`, etc. 
+ * This would allow a single rust app to target multiple compilers in a robust way, and would make it easier to specify `repr(C)` + * This would also allow fixing code in older editions +* https://internals.rust-lang.org/t/consistent-ordering-of-struct-fileds-across-all-layout-compatible-generics/23247 From 0637bbfb1d27ac9c9a26d78b4e04093292eefc57 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Fri, 1 Aug 2025 11:32:58 -0700 Subject: [PATCH 02/28] Add a link to the zullip thread --- text/0000-repr-ordered-fields.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/0000-repr-ordered-fields.md b/text/0000-repr-ordered-fields.md index 98ed6303e88..2357a2403be 100644 --- a/text/0000-repr-ordered-fields.md +++ b/text/0000-repr-ordered-fields.md @@ -140,7 +140,10 @@ The warning should come with a machine-applicable fix to switch `repr(C)` to `re # Prior art [prior-art]: #prior-art -This is discussed in Rationale and Alternatives +See Rationale and Alternatives as well + +* https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/expand.2Frevise.20repr.28.7BC.2Clinear.2C.2E.2E.2E.7D.29.20for.202024.20edition + # Unresolved questions [unresolved-questions]: #unresolved-questions From 7e4b0c835c300fc993adabaafd4d62baa9ac6bf5 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 09:04:49 -0700 Subject: [PATCH 03/28] Update RFC with PR number --- ...0-repr-ordered-fields.md => 3845-repr-ordered-fields.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-repr-ordered-fields.md => 3845-repr-ordered-fields.md} (97%) diff --git a/text/0000-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md similarity index 97% rename from text/0000-repr-ordered-fields.md rename to text/3845-repr-ordered-fields.md index 2357a2403be..5c145224828 100644 --- a/text/0000-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -1,6 +1,6 @@ -- Feature Name: (fill me in with a unique ident, `repr_ordered_fields`) -- 
Start Date: (fill me in with today's date, 2025-08-01) -- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Feature Name: `repr_ordered_fields` +- Start Date: 2025-08-05 +- RFC PR: [rust-lang/rfcs#3845](https://github.com/rust-lang/rfcs/pull/3845) - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) # Summary From 61ce0f26031303e103ab4ddc4d9cb7325025793a Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 09:21:43 -0700 Subject: [PATCH 04/28] Grammer fixes --- text/3845-repr-ordered-fields.md | 38 ++++++++++++++++---------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 5c145224828..02f88161d98 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bikeshedded if this RFC get's accepted) which can be applied to `struct`, `enum`, and `union` types which guarantees a simple, and predictable layout. Then provide an initial migration plan to switch users from `repr(C)` to `repr(ordered_fields)`. +Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bikeshedded if this RFC is accepted) that can be applied to `struct`, `enum`, and `union` types, which guarantees a simple and predictable layout. Then provide an initial migration plan to switch users from `repr(C)` to `repr(ordered_fields)`. # Motivation [motivation]: #motivation @@ -25,11 +25,11 @@ Providing any fix for case 2 would subtly break any users of case 1, which makes Introduce a new `repr(ordered_fields)` which can be applied to `struct`, `enum`, and `union`. On all editions, `repr(ordered_fields)` would behave the same as `repr(C)` on edition 2024. (see reference level explanation for details). 
-On editions 2024 (maybe <= 2024), any use of `repr(C)` will trigger a new warning, `edition_2024_repr_c` which will be warn by default. -This warning will suggest a machine applicable fix to switch `repr(C)` to `repr(ordered_fields)`, which is a no-op in the current edition, but helps prepare for changes to `repr(C)` early. This gives time for the community to switch over if they need to. +In editions 2024 (maybe <= 2024), any use of `repr(C)` will trigger a new warning, `edition_2024_repr_c` which will be warn by default. +This warning suggests a machine-applicable fix to switch `repr(C)` to `repr(ordered_fields)`, which is a no-op in the current edition, but helps prepare for changes to `repr(C)` early. This gives time for the community to update their code as needed. For the FFI crates, they can safely ignore the warning by applying `#![allow(edition_2024_repr_c)]` to their crate root. -For crates without any FFI, they can simply run the machine applicable fix. +For crates without any FFI, they can simply run the machine-applicable fix. For crates with a mix, they will need to do some work to figure out which is which. But this is unavoidable to solve the stated motivation. For example, the warning could look like this: @@ -42,22 +42,22 @@ warning: use of `repr(C)` in type `Foo` | struct Foo { | = note: `#[warn(edition_2024_repr_c)]` on by default - = note: `repr(C)` is planned to change meaning in the next edition to match the target platform's layout algorithm. This may change the layout of this type on certain platforms. To keep the current layout, switch to `repr(ordred_fields)` + = note: `repr(C)` is planned to change meaning in the next edition to match the target platform's layout algorithm. This may change the layout of this type on certain platforms. To keep the current layout, switch to `repr(ordered_fields)` ``` On editions > 2024, `repr(ordered_fields)` may differ from `repr(C)`, so that `repr(C)` can match the platform's layout algorithm. 
-On all editions, once people have made the switch this will make it easier to tell *why* the `repr` was applied to a given struct. If `repr(C)`, it's about FFI and interop, if `repr(ordered_fields)` then it's for a dependable layout. Unlike today, where `repr(C)` fills both roles. +On all editions, once people have made the switch, this will make it easier to tell *why* the `repr` was applied to a given struct. If `repr(C)`, it's about FFI and interop. If `repr(ordered_fields)`, then it's for a dependable layout. This is more clear than today, where `repr(C)` fills both roles. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -This feature only touches `repr(C)`, other reprs are left as is. It introduces exactly one new repr, `repr(ordered_fields)` to take on one of the roles that `repr(C)` use to take. +This feature only touches `repr(C)`, other reprs are left as is. It introduces exactly one new repr, `repr(ordered_fields)`, to take on one of the roles that `repr(C)` used to take. -`repr(ordered_fields)` will use the same layout algorithm that `repr(C)` currently uses, details can be found in the [reference](https://doc.rust-lang.org/reference/type-layout.html?highlight=repr#reprc-structs). I will give an informal overview here. +`repr(ordered_fields)` will use the same layout algorithm that `repr(C)` currently uses. Details can be found in the [reference](https://doc.rust-lang.org/reference/type-layout.html?highlight=repr#reprc-structs). I will give an informal overview here. -For structs, `repr(ordered_fields)` lays out each fields in memory according to the declaration order of the fields. +For structs, `repr(ordered_fields)` lays out each field in memory according to the declaration order of the fields. ```rust #[repr(ordered_fields)] @@ -74,11 +74,11 @@ Would be laid out like so (where `.` are padding bytes) a b c d ``` -For unions, each field is laid out at offset 0, and never have niches. 
+For unions, each field is laid out at offset 0, and never has niches. -For enums, the discriminant size is left unspecified (unless another repr specifies it like `repr(ordered_fields, u8)`), but is guaranteed to be stable across rust versions for a given set of variants and fields in each variant. +For enums, the discriminant size is left unspecified (unless another repr specifies it like `repr(ordered_fields, u8)`), but is guaranteed to be stable across Rust versions for a given set of variants and fields in each variant. -Enums are defined as a union of structs, where each struct corresponds to each variant of the enum, with the discriminant is prepended as the first field. +Enums are defined as a union of structs, where each struct corresponds to each variant of the enum, with the discriminant prepended as the first field. For example, `Foo` and `Bar` have the same layout in the example below modulo niches. @@ -118,25 +118,25 @@ Introduce a new `repr(ordered_fields)` which can be applied to `struct`, `enum`, On editions > 2024, `repr(ordered_fields)` may differ from `repr(C)`, so that `repr(C)` can match the platform's layout algorithm. For an extreme example, we could stop compiling `repr(C)` for ZST if the target C compiler doesn't allow ZSTs, or we could bump the size to 1 byte if the target C compiler does that by default (this is just an illustrative example, and not endorsed by RFC). -As mentioned in the guide-level explanation, on edition 2024 (maybe <= 2024), any use of `repr(C)` would trigger a new warn by default diagnostic, `edition_2024_repr_c`. This warning could be phased out after at least two editions have passed. This gives the community enough time to migrate any code over to `repr(ordered_fields)` before the next edition after 2024, but doesn't burden Rust forever. +As mentioned in the guide-level explanation, in edition 2024 (maybe <= 2024), any use of `repr(C)` would trigger a new warn by default diagnostic, `edition_2024_repr_c`. 
This warning could be phased out after at least two editions have passed. This gives the community enough time to migrate any code over to `repr(ordered_fields)` before the next edition after 2024, but doesn't burden Rust forever. The warning should come with a machine-applicable fix to switch `repr(C)` to `repr(ordered_fields)`, and this fix should be part of `cargo fix`. # Drawbacks [drawbacks]: #drawbacks * This will cause a large amount of churn in the Rust ecosystem -* If we don't end up actually switching `repr(C)` to mean the system layout/ABI, then we will have two identical reprs, which may cause confusion. +* If we don't end up switching `repr(C)` to mean the system layout/ABI, then we will have two identical reprs, which may cause confusion. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives * `crabi`: http://github.com/rust-lang/rfcs/pull/3470 - * Currently stuck in limbo since it has a much larger scope. doesn't actually serve to give a consistent cross-platform layout, since it defers to `repr(C)` (and it must, for it's stated goals) + * Currently stuck in limbo since it has a much larger scope. doesn't actually serve to give a consistent cross-platform layout, since it defers to `repr(C)` (and it must, for its stated goals) * https://internals.rust-lang.org/t/consistent-ordering-of-struct-fileds-across-all-layout-compatible-generics/23247 * This doesn't give a predictable layout that can be used to match the layouts (or prefixes) of different structs * https://github.com/rust-lang/rfcs/pull/3718 * This one is currently stuck due to a larger scope than this RFC * do nothing - * We keep getting bug reports on Windows (and other platforms), where `repr(C)` doesn't actually match the target C compiler. Or we break a bunch of subtle unsafe code to match the target C compiler. 
+ * We keep getting bug reports on Windows (and other platforms), where `repr(C)` doesn't actually match the target C compiler, or we break a bunch of subtle unsafe code to match the target C compiler. # Prior art [prior-art]: #prior-art @@ -147,8 +147,8 @@ See Rationale and Alternatives as well # Unresolved questions [unresolved-questions]: #unresolved-questions -* The migration plan, as a whole needs to be ironed out - * Currently it is just a sketch, but we need timelines, dates, and guarantees to switch `repr(C)` to match the layout algorithm of the target C compiler. +* The migration plan, as a whole, needs to be ironed out + * Currently, it is just a sketch, but we need timelines, dates, and guarantees to switch `repr(C)` to match the layout algorithm of the target C compiler. * Before this RFC is accepted, t-compiler will need to commit to fixing the layout algorithm sometime in the next edition. * The name of the new repr `repr(ordered_fields)` is a mouthful (intentionally for this RFC), maybe we could pick a better name? This could be done after the RFC is accepted. * `repr(linear)` @@ -160,6 +160,6 @@ See Rationale and Alternatives as well [future-possibilities]: #future-possibilities * Add more reprs for each target C compiler, for example `repr(C_gcc)` or `repr(C_msvc)`, etc. 
- * This would allow a single rust app to target multiple compilers in a robust way, and would make it easier to specify `repr(C)` + * This would allow a single Rust app to target multiple compilers robustly, and would make it easier to specify `repr(C)` * This would also allow fixing code in older editions * https://internals.rust-lang.org/t/consistent-ordering-of-struct-fileds-across-all-layout-compatible-generics/23247 From 4fa572e57773c9baf3bd15d854e83b047a73570d Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 12:27:04 -0700 Subject: [PATCH 05/28] Update Motivation to include the exact issue from MSVC --- text/3845-repr-ordered-fields.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 02f88161d98..dbf0e46f91d 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -20,6 +20,17 @@ But in some cases, these two cases are in tension due to platform weirdness (eve * https://github.com/rust-lang/rust/issues/81996 Providing any fix for case 2 would subtly break any users of case 1, which makes this difficult to fix within a single edition. + +As an example of this tension: on Windows MSVC, `repr(C)` doesn't always match what MSVC does for ZST structs (see this [issue](https://github.com/rust-lang/rust/issues/81996) for more details) + +```rust +// should have size 8, but has size 0 +#[repr(C)] +struct SomeFFI([i64; 0]); +``` + +Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for case 1. They want it to be size 0 (as it currently is). 
+ # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From 755644ca339c04c7efada4b002283ac6955ae5bf Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 15:45:13 -0700 Subject: [PATCH 06/28] Rework reference level section add more unresolved questions --- text/3845-repr-ordered-fields.md | 239 ++++++++++++++++++++++++------- 1 file changed, 189 insertions(+), 50 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index dbf0e46f91d..fd6dbc549c3 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -34,16 +34,12 @@ Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for c # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -Introduce a new `repr(ordered_fields)` which can be applied to `struct`, `enum`, and `union`. On all editions, `repr(ordered_fields)` would behave the same as `repr(C)` on edition 2024. (see reference level explanation for details). +`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout. -In editions 2024 (maybe <= 2024), any use of `repr(C)` will trigger a new warning, `edition_2024_repr_c` which will be warn by default. -This warning suggests a machine-applicable fix to switch `repr(C)` to `repr(ordered_fields)`, which is a no-op in the current edition, but helps prepare for changes to `repr(C)` early. This gives time for the community to update their code as needed. +`repr(C)` in edition <= 2024 is an alias for `repr(ordered_fields)` and in all other editions, it matches the default C compiler for the given target for structs, unions, and field-less enums. Enums with fields will be laid out as if they are a union of structs with the corresponding fields. 
-For the FFI crates, they can safely ignore the warning by applying `#![allow(edition_2024_repr_c)]` to their crate root. -For crates without any FFI, they can simply run the machine-applicable fix. -For crates with a mix, they will need to do some work to figure out which is which. But this is unavoidable to solve the stated motivation. +Using `repr(C)` in editions <= 2024 triggers a lint to use `repr(ordered_fields)` as a future compatibility lint with a machine-applicable fix. If you are using `repr(C)` for FFI, then you may silence this lint. If you are using `repr(C)` for anything else, please switch over to `repr(ordered_fields)` so updating to future editions doesn't change the meaning of your code. -For example, the warning could look like this: ``` warning: use of `repr(C)` in type `Foo` --> src/main.rs:14:10 @@ -56,82 +52,217 @@ warning: use of `repr(C)` in type `Foo` = note: `repr(C)` is planned to change meaning in the next edition to match the target platform's layout algorithm. This may change the layout of this type on certain platforms. To keep the current layout, switch to `repr(ordered_fields)` ``` - -On editions > 2024, `repr(ordered_fields)` may differ from `repr(C)`, so that `repr(C)` can match the platform's layout algorithm. - -On all editions, once people have made the switch, this will make it easier to tell *why* the `repr` was applied to a given struct. If `repr(C)`, it's about FFI and interop. If `repr(ordered_fields)`, then it's for a dependable layout. This is more clear than today, where `repr(C)` fills both roles. - +After enough time has passed, and the community has switched over: +This makes it easier to tell *why* the `repr` was applied to a given struct. If `repr(C)`, it's about FFI and interop. If `repr(ordered_fields)`, then it's for a dependable layout. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -This feature only touches `repr(C)`, other reprs are left as is. 
It introduces exactly one new repr, `repr(ordered_fields)`, to take on one of the roles that `repr(C)` used to take. +## `repr(C)` -`repr(ordered_fields)` will use the same layout algorithm that `repr(C)` currently uses. Details can be found in the [reference](https://doc.rust-lang.org/reference/type-layout.html?highlight=repr#reprc-structs). I will give an informal overview here. +> The `C` representation is designed for one purpose: creating types that are interoperable with the C Language. +> +> This representation can be applied to structs, unions, and enums. The exception is [zero-variant enums](https://doc.rust-lang.org/stable/reference/items/enumerations.html#zero-variant-enums) for which the `C` representation is an error. +> +> - edited version of the [reference](https://doc.rust-lang.org/stable/reference/type-layout.html#the-c-representation) on `repr(C)` -For structs, `repr(ordered_fields)` lays out each field in memory according to the declaration order of the fields. +The exact algorithm is deferred to whatever the default target C compiler does with default settings (or if applicable, the most commonly used settings). +## `repr(ordered_fields)` + +> The `ordered_fields` representation is designed for one purpose: create types that you can soundly perform operations on that rely on data layout such as reinterpreting values as a different type +> +> This representation can be applied to structs, unions, and enums. +> +> - edited version of the [reference](https://doc.rust-lang.org/stable/reference/type-layout.html#the-c-representation) on `repr(C)` +### struct +Structs are laid out in memory in declaration order. ```rust +// size 16, align 4 #[repr(ordered_fields)] -struct Foo { - a: u32, - b: u8, - c: u32, - d: u16, +struct FooStruct { + a: u8, + b: u32, + c: u16, + d: u32, } ``` -Would be laid out like so (where `.` are padding bytes) + +Would be laid out in memory like so + +``` +a...bbbbcc..dddd ``` -#####...######.. 
-a b c d +### union +```rust +// size 4, align 4 +#[repr(ordered_fields)] +union FooUnion { + a: u8, + b: u32, + c: u16, + d: u32, +} ``` -For unions, each field is laid out at offset 0, and never has niches. +Would have the same layout as `u32`. +### enum +The enum discriminant will be the smallest signed integer type which can hold all of the discriminant values (unless otherwise specified). The discriminants are assigned such that each variant without an explicit discriminant is exactly one more than the previous variant in declaration order. -For enums, the discriminant size is left unspecified (unless another repr specifies it like `repr(ordered_fields, u8)`), but is guaranteed to be stable across Rust versions for a given set of variants and fields in each variant. +If an enum doesn't have any fields, then it is represented exactly by it's discriminant. +```rust +// discriminant = i16 +// represented as i16 +#[repr(ordered_fields)] +enum FooEnum { + VarA = 1, + VarB, // discriminant = 2 + VarC = 500, + VarD, // discriminant = 501 +} -Enums are defined as a union of structs, where each struct corresponds to each variant of the enum, with the discriminant prepended as the first field. +// discriminant = u16 +// represented as u16 +#[repr(ordered_fields, u16)] +enum FooEnumUnsigned { + VarA = 1, + VarB, // discriminant = 2 + VarC = 500, + VarD, // discriminant = 501 +} +``` -For example, `Foo` and `Bar` have the same layout in the example below modulo niches. +Enums with fields will be laid out as if they were a union of structs. 
+For example, this would be laid out the same as the union below ```rust -#[repr(ordered_fields, u32)] -enum Foo { - Variant1, - Variant2(u8, u64), - Variant3 { - name: String, - } +#[repr(ordered_fields)] +enum BarEnum { + VarFieldless, + VarTuple(u8, u32), + VarStruct { + a: u16, + b: u32, + }, } +``` +```rust #[repr(ordered_fields)] -union Bar { - variant1: BarVariant1, - variant2: BarVariant2, - variant3: BarVariant3, +union BarUnion { + var1: VarFieldless, + var2: VarTuple, + var3: VarStruct, } #[repr(ordered_fields)] -struct BarVariant1 { - discr: u32, +enum BarDiscriminant { + VarFieldless, + VarTuple, + VarStruct, } #[repr(ordered_fields)] -struct BarVariant2(u32, u8, u64); +struct VarFieldless { + disc: BarDiscriminant, +} #[repr(ordered_fields)] -struct BarVariant3 { - discr: u32, - name: String, -} +struct VarTuple(BarDiscriminant, u8, u32); + +#[repr(ordered_fields)] +struct VarStruct(BarDiscriminant, u16, u32); ``` -Introduce a new `repr(ordered_fields)` which can be applied to `struct`, `enum`, and `union`. On all editions, `repr(ordered_fields)` would behave the same as `repr(C)` on edition 2024. +In Rust, the algorithm for calculating the layout is defined precisely as follows -On editions > 2024, `repr(ordered_fields)` may differ from `repr(C)`, so that `repr(C)` can match the platform's layout algorithm. For an extreme example, we could stop compiling `repr(C)` for ZST if the target C compiler doesn't allow ZSTs, or we could bump the size to 1 byte if the target C compiler does that by default (this is just an illustrative example, and not endorsed by RFC). 
+```rust +/// Takes in the layout of each field (in declaration order) +/// and returns the offsets of each field, and layout of the entire struct +fn get_layout_for_struct(field_layouts: &[Layout]) -> Result<(Vec, Layout), LayoutError> { + let mut layout = Layout::new::<()>(); + let mut field_offsets = Vec::new(); + + for &field in field_layouts { + let (next_layout, offset) = layout.extend(field)?; + + field_offsets.push(offset); + layout = next_layout; + } + + Ok((field_offsets, layout.pad_to_align())) +} -As mentioned in the guide-level explanation, in edition 2024 (maybe <= 2024), any use of `repr(C)` would trigger a new warn by default diagnostic, `edition_2024_repr_c`. This warning could be phased out after at least two editions have passed. This gives the community enough time to migrate any code over to `repr(ordered_fields)` before the next edition after 2024, but doesn't burden Rust forever. +fn layout_max(a: Layout, b: Layout) -> Result { + Layout::from_size_align( + a.size().max(b.size()), + a.align().max(b.align()), + ) +} + +/// Takes in the layout of each field (in declaration order) +/// and returns the layout of the entire union +/// NOTE: all fields of the union are located at offset 0 +fn get_layout_for_union(field_layouts: &[Layout]) -> Result { + let mut layout = Layout::new::<()>(); + + for &field in field_layouts { + layout = layout_max(layout, field)?; + } + + Ok(layout.pad_to_align()) +} + +/// Takes in the layout of each variant (and their fields) (in declaration order) +/// and returns the offsets of all fields of the enum, and the layout of the entire enum +/// NOTE: all fields of the enum discriminant is always at offset 0 +fn get_layout_for_enum( + // the discriminants may be negative for some enums + // or u128::MAX for some enums, so there is no one primitive integer type which works. 
So BigInteger + discriminants: &[BigInteger], + variant_layouts: &[&[Layout]] +) -> Result<(Vec>, Layout), LayoutError> { + assert_eq!(discriminants.len(), variant_layouts.len()); + + let mut layout = Layout::new::<()>(); + let mut variant_field_offsets = Vec::new(); + + let mut variant_with_disc = Vec::new(); + // gives the smallest integer type which can represent the variants and the specified discriminants + let disc_layout = get_layout_for_discriminant(discriminants); + // ensure that the discriminant is the first field + variant_with_disc.push(disc_layout); + + for &variant in variant_layouts { + variant_with_disc.truncate(1); + // put all other fields of the variant after this one + variant_with_disc.extend_from_slice(variant); + let (mut offsets, variant_layout) = get_layout_for_struct(&variant_with_disc)?; + // remove the discriminant so the caller only gets the fields they provided in order + offsets.remove(0); + + variant_field_offsets.push(offsets); + layout = layout_max(layout, variant_layout)?; + } + + Ok((variant_field_offsets, layout)) +} +``` +### Migration to `repr(ordered_fields)` + +The migration will be handled as follows: +* after `repr(ordered_fields)` is implemented + * add a future compatibility warning for `repr(C)` in all current editions + * at this point both `repr(ordered_fields)` and `repr(C)` will have identical behavior + * the warning will come with a machine-applicable fix + * Any crate which does no FFI can just apply the autofix + * Any crate which uses `repr(C)` for FFI can ignore the warning crate-wide + * Any crate which mixes both must do extra work to figure out which is which. (This is likely a tiny minority of crate) +* Once the next edition rolls around (2027?), `repr(C)` on the new edition will *not* warn. Instead the meaning will have changed to mean *only* compatibility with C. The docs should be adjusted to mention this edition wrinkle. 
+ * The warning for previous editions will continue to be in effect +* In some future edition (2033+), when it is deemed safe enough, the future compatibility warnings may be removed in editions <= 2024 + * This should have given plenty of time for anyone who was going to update their code to do so. And doesn't burden the language indefinitely + * This part isn't strictly necessary, and can be removed if needed -The warning should come with a machine-applicable fix to switch `repr(C)` to `repr(ordered_fields)`, and this fix should be part of `cargo fix`. # Drawbacks [drawbacks]: #drawbacks @@ -165,8 +296,16 @@ See Rationale and Alternatives as well * `repr(linear)` * `repr(ordered)` * `repr(sequential)` + * `repr(consistent)` * something else? - +* Is the ABI of `repr(ordered_fields)` specified (making it safe for FFI)? Or not? +* Should unions expose some niches? + * For example, if all variants of the union are structs which have a common prefix, then any niches of that common prefix could be exposed (i.e. in the enum case, making union of structs behave more like an enum). + * This must be answered before stabilization, as it is set in stone after that +* Should this `repr` be versioned? + * This way we can evolve the repr (for example, by adding new niches) +* Should we change the meaning of `repr(C)` in editions <= 2024 after we have reached edition 2033? Yes, it's a breaking change, but at that point it will likely only be breaking code no one uses. + * Leaning towards no # Future possibilities [future-possibilities]: #future-possibilities From 0d0a79b9a38e6841f672d3512253fcd214fa598e Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 15:49:42 -0700 Subject: [PATCH 07/28] Add description for union layout Rework struct layout description. 
--- text/3845-repr-ordered-fields.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index fd6dbc549c3..84e8d7ee0cf 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -74,7 +74,8 @@ The exact algorithm is deferred to whatever the default target C compiler does w > > - edited version of the [reference](https://doc.rust-lang.org/stable/reference/type-layout.html#the-c-representation) on `repr(C)` ### struct -Structs are laid out in memory in declaration order. +Structs are laid out in memory in declaration order, with padding bytes added as necessary to preserve alignment. +And their alignment would be the same as their most aligned field. ```rust // size 16, align 4 @@ -93,6 +94,8 @@ Would be laid out in memory like so a...bbbbcc..dddd ``` ### union +Unions would be laid out with the same size as their largest field, and the same alignment as their most aligned field. + ```rust // size 4, align 4 #[repr(ordered_fields)] From 4f4bf18d5b7a8ef337950b2d6ff80200bc19e781 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 15:50:43 -0700 Subject: [PATCH 08/28] Tabs -> Spaces --- text/3845-repr-ordered-fields.md | 202 +++++++++++++++---------------- 1 file changed, 101 insertions(+), 101 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 84e8d7ee0cf..c009ecbee24 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -81,10 +81,10 @@ And their alignment would be the same as their most aligned field. 
// size 16, align 4 #[repr(ordered_fields)] struct FooStruct { - a: u8, - b: u32, - c: u16, - d: u32, + a: u8, + b: u32, + c: u16, + d: u32, } ``` @@ -100,10 +100,10 @@ Unions would be laid out with the same size as their largest field, and the same // size 4, align 4 #[repr(ordered_fields)] union FooUnion { - a: u8, - b: u32, - c: u16, - d: u32, + a: u8, + b: u32, + c: u16, + d: u32, } ``` @@ -117,20 +117,20 @@ If an enum doesn't have any fields, then it is represented exactly by it's discr // represented as i16 #[repr(ordered_fields)] enum FooEnum { - VarA = 1, - VarB, // discriminant = 2 - VarC = 500, - VarD, // discriminant = 501 + VarA = 1, + VarB, // discriminant = 2 + VarC = 500, + VarD, // discriminant = 501 } // discriminant = u16 // represented as u16 #[repr(ordered_fields, u16)] enum FooEnumUnsigned { - VarA = 1, - VarB, // discriminant = 2 - VarC = 500, - VarD, // discriminant = 501 + VarA = 1, + VarB, // discriminant = 2 + VarC = 500, + VarD, // discriminant = 501 } ``` @@ -140,33 +140,33 @@ For example, this would be laid out the same as the union below ```rust #[repr(ordered_fields)] enum BarEnum { - VarFieldless, - VarTuple(u8, u32), - VarStruct { - a: u16, - b: u32, - }, + VarFieldless, + VarTuple(u8, u32), + VarStruct { + a: u16, + b: u32, + }, } ``` ```rust #[repr(ordered_fields)] union BarUnion { - var1: VarFieldless, - var2: VarTuple, - var3: VarStruct, + var1: VarFieldless, + var2: VarTuple, + var3: VarStruct, } #[repr(ordered_fields)] enum BarDiscriminant { - VarFieldless, - VarTuple, - VarStruct, + VarFieldless, + VarTuple, + VarStruct, } #[repr(ordered_fields)] struct VarFieldless { - disc: BarDiscriminant, + disc: BarDiscriminant, } #[repr(ordered_fields)] @@ -182,89 +182,89 @@ In Rust, the algorithm for calculating the layout is defined precisely as follow /// Takes in the layout of each field (in declaration order) /// and returns the offsets of each field, and layout of the entire struct fn get_layout_for_struct(field_layouts: 
&[Layout]) -> Result<(Vec, Layout), LayoutError> { - let mut layout = Layout::new::<()>(); - let mut field_offsets = Vec::new(); - - for &field in field_layouts { - let (next_layout, offset) = layout.extend(field)?; - - field_offsets.push(offset); - layout = next_layout; - } - - Ok((field_offsets, layout.pad_to_align())) + let mut layout = Layout::new::<()>(); + let mut field_offsets = Vec::new(); + + for &field in field_layouts { + let (next_layout, offset) = layout.extend(field)?; + + field_offsets.push(offset); + layout = next_layout; + } + + Ok((field_offsets, layout.pad_to_align())) } fn layout_max(a: Layout, b: Layout) -> Result { - Layout::from_size_align( - a.size().max(b.size()), - a.align().max(b.align()), - ) + Layout::from_size_align( + a.size().max(b.size()), + a.align().max(b.align()), + ) } /// Takes in the layout of each field (in declaration order) /// and returns the layout of the entire union /// NOTE: all fields of the union are located at offset 0 fn get_layout_for_union(field_layouts: &[Layout]) -> Result { - let mut layout = Layout::new::<()>(); - - for &field in field_layouts { - layout = layout_max(layout, field)?; - } - - Ok(layout.pad_to_align()) + let mut layout = Layout::new::<()>(); + + for &field in field_layouts { + layout = layout_max(layout, field)?; + } + + Ok(layout.pad_to_align()) } /// Takes in the layout of each variant (and their fields) (in declaration order) /// and returns the offsets of all fields of the enum, and the layout of the entire enum /// NOTE: all fields of the enum discriminant is always at offset 0 fn get_layout_for_enum( - // the discriminants may be negative for some enums - // or u128::MAX for some enums, so there is no one primitive integer type which works. So BigInteger - discriminants: &[BigInteger], - variant_layouts: &[&[Layout]] + // the discriminants may be negative for some enums + // or u128::MAX for some enums, so there is no one primitive integer type which works. 
So BigInteger + discriminants: &[BigInteger], + variant_layouts: &[&[Layout]] ) -> Result<(Vec>, Layout), LayoutError> { - assert_eq!(discriminants.len(), variant_layouts.len()); - + assert_eq!(discriminants.len(), variant_layouts.len()); + let mut layout = Layout::new::<()>(); - let mut variant_field_offsets = Vec::new(); - - let mut variant_with_disc = Vec::new(); - // gives the smallest integer type which can represent the variants and the specified discriminants + let mut variant_field_offsets = Vec::new(); + + let mut variant_with_disc = Vec::new(); + // gives the smallest integer type which can represent the variants and the specified discriminants let disc_layout = get_layout_for_discriminant(discriminants); - // ensure that the discriminant is the first field + // ensure that the discriminant is the first field variant_with_disc.push(disc_layout); - for &variant in variant_layouts { - variant_with_disc.truncate(1); - // put all other fields of the variant after this one - variant_with_disc.extend_from_slice(variant); - let (mut offsets, variant_layout) = get_layout_for_struct(&variant_with_disc)?; - // remove the discriminant so the caller only gets the fields they provided in order - offsets.remove(0); - - variant_field_offsets.push(offsets); - layout = layout_max(layout, variant_layout)?; - } - - Ok((variant_field_offsets, layout)) + for &variant in variant_layouts { + variant_with_disc.truncate(1); + // put all other fields of the variant after this one + variant_with_disc.extend_from_slice(variant); + let (mut offsets, variant_layout) = get_layout_for_struct(&variant_with_disc)?; + // remove the discriminant so the caller only gets the fields they provided in order + offsets.remove(0); + + variant_field_offsets.push(offsets); + layout = layout_max(layout, variant_layout)?; + } + + Ok((variant_field_offsets, layout)) } ``` ### Migration to `repr(ordered_fields)` The migration will be handled as follows: * after `repr(ordered_fields)` is implemented - * 
add a future compatibility warning for `repr(C)` in all current editions - * at this point both `repr(ordered_fields)` and `repr(C)` will have identical behavior - * the warning will come with a machine-applicable fix - * Any crate which does no FFI can just apply the autofix - * Any crate which uses `repr(C)` for FFI can ignore the warning crate-wide - * Any crate which mixes both must do extra work to figure out which is which. (This is likely a tiny minority of crate) + * add a future compatibility warning for `repr(C)` in all current editions + * at this point both `repr(ordered_fields)` and `repr(C)` will have identical behavior + * the warning will come with a machine-applicable fix + * Any crate which does no FFI can just apply the autofix + * Any crate which uses `repr(C)` for FFI can ignore the warning crate-wide + * Any crate which mixes both must do extra work to figure out which is which. (This is likely a tiny minority of crate) * Once the next edition rolls around (2027?), `repr(C)` on the new edition will *not* warn. Instead the meaning will have changed to mean *only* compatibility with C. The docs should be adjusted to mention this edition wrinkle. - * The warning for previous editions will continue to be in effect + * The warning for previous editions will continue to be in effect * In some future edition (2033+), when it is deemed safe enough, the future compatibility warnings may be removed in editions <= 2024 - * This should have given plenty of time for anyone who was going to update their code to do so. And doesn't burden the language indefinitely - * This part isn't strictly necessary, and can be removed if needed + * This should have given plenty of time for anyone who was going to update their code to do so. 
And doesn't burden the language indefinitely + * This part isn't strictly necessary, and can be removed if needed # Drawbacks [drawbacks]: #drawbacks @@ -275,13 +275,13 @@ The migration will be handled as follows: [rationale-and-alternatives]: #rationale-and-alternatives * `crabi`: http://github.com/rust-lang/rfcs/pull/3470 - * Currently stuck in limbo since it has a much larger scope. doesn't actually serve to give a consistent cross-platform layout, since it defers to `repr(C)` (and it must, for its stated goals) + * Currently stuck in limbo since it has a much larger scope. doesn't actually serve to give a consistent cross-platform layout, since it defers to `repr(C)` (and it must, for its stated goals) * https://internals.rust-lang.org/t/consistent-ordering-of-struct-fileds-across-all-layout-compatible-generics/23247 - * This doesn't give a predictable layout that can be used to match the layouts (or prefixes) of different structs + * This doesn't give a predictable layout that can be used to match the layouts (or prefixes) of different structs * https://github.com/rust-lang/rfcs/pull/3718 - * This one is currently stuck due to a larger scope than this RFC + * This one is currently stuck due to a larger scope than this RFC * do nothing - * We keep getting bug reports on Windows (and other platforms), where `repr(C)` doesn't actually match the target C compiler, or we break a bunch of subtle unsafe code to match the target C compiler. + * We keep getting bug reports on Windows (and other platforms), where `repr(C)` doesn't actually match the target C compiler, or we break a bunch of subtle unsafe code to match the target C compiler. 
# Prior art [prior-art]: #prior-art @@ -293,26 +293,26 @@ See Rationale and Alternatives as well [unresolved-questions]: #unresolved-questions * The migration plan, as a whole, needs to be ironed out - * Currently, it is just a sketch, but we need timelines, dates, and guarantees to switch `repr(C)` to match the layout algorithm of the target C compiler. - * Before this RFC is accepted, t-compiler will need to commit to fixing the layout algorithm sometime in the next edition. + * Currently, it is just a sketch, but we need timelines, dates, and guarantees to switch `repr(C)` to match the layout algorithm of the target C compiler. + * Before this RFC is accepted, t-compiler will need to commit to fixing the layout algorithm sometime in the next edition. * The name of the new repr `repr(ordered_fields)` is a mouthful (intentionally for this RFC), maybe we could pick a better name? This could be done after the RFC is accepted. - * `repr(linear)` - * `repr(ordered)` - * `repr(sequential)` - * `repr(consistent)` - * something else? + * `repr(linear)` + * `repr(ordered)` + * `repr(sequential)` + * `repr(consistent)` + * something else? * Is the ABI of `repr(ordered_fields)` specified (making it safe for FFI)? Or not? * Should unions expose some niches? - * For example, if all variants of the union are structs which have a common prefix, then any niches of that common prefix could be exposed (i.e. in the enum case, making union of structs behave more like an enum). - * This must be answered before stabilization, as it is set in stone after that + * For example, if all variants of the union are structs which have a common prefix, then any niches of that common prefix could be exposed (i.e. in the enum case, making union of structs behave more like an enum). + * This must be answered before stabilization, as it is set in stone after that * Should this `repr` be versioned? 
- * This way we can evolve the repr (for example, by adding new niches) + * This way we can evolve the repr (for example, by adding new niches) * Should we change the meaning of `repr(C)` in editions <= 2024 after we have reached edition 2033? Yes, it's a breaking change, but at that point it will likely only be breaking code no one uses. - * Leaning towards no + * Leaning towards no # Future possibilities [future-possibilities]: #future-possibilities * Add more reprs for each target C compiler, for example `repr(C_gcc)` or `repr(C_msvc)`, etc. - * This would allow a single Rust app to target multiple compilers robustly, and would make it easier to specify `repr(C)` - * This would also allow fixing code in older editions + * This would allow a single Rust app to target multiple compilers robustly, and would make it easier to specify `repr(C)` + * This would also allow fixing code in older editions * https://internals.rust-lang.org/t/consistent-ordering-of-struct-fileds-across-all-layout-compatible-generics/23247 From 063af084d8cf1eb52eaac879d0a09ffa0766a68d Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 16:18:40 -0700 Subject: [PATCH 09/28] Apply suggestions from code review Co-authored-by: Jacob Lifshay --- text/3845-repr-ordered-fields.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index c009ecbee24..7920c6c6e0b 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -75,7 +75,7 @@ The exact algorithm is deferred to whatever the default target C compiler does w > - edited version of the [reference](https://doc.rust-lang.org/stable/reference/type-layout.html#the-c-representation) on `repr(C)` ### struct Structs are laid out in memory in declaration order, with padding bytes added as necessary to preserve alignment. -And their alignment would be the same as their most aligned field. 
+And their alignment is the same as their most aligned field. ```rust // size 16, align 4 @@ -107,9 +107,9 @@ union FooUnion { } ``` -Would have the same layout as `u32`. +`FooUnion` has the same layout as `u32`, since `u32` has both the biggest size and alignment. ### enum -The enum discriminant will be the smallest signed integer type which can hold all of the discriminant values (unless otherwise specified). The discriminants are assigned such that each variant without an explicit discriminant is exactly one more than the previous variant in declaration order. +The enum discriminant is the smallest signed integer type which can hold all of the discriminant values (unless otherwise specified). The discriminants are assigned such that each variant without an explicit discriminant is exactly one more than the previous variant in declaration order. If an enum doesn't have any fields, then it is represented exactly by it's discriminant. ```rust @@ -176,11 +176,11 @@ struct VarTuple(BarDiscriminant, u8, u32); struct VarStruct(BarDiscriminant, u16, u32); ``` -In Rust, the algorithm for calculating the layout is defined precisely as follows +In Rust, the algorithm for calculating the layout is defined precisely as follows: ```rust /// Takes in the layout of each field (in declaration order) -/// and returns the offsets of each field, and layout of the entire struct +/// and returns the offsets of each field, and the layout of the entire struct fn get_layout_for_struct(field_layouts: &[Layout]) -> Result<(Vec, Layout), LayoutError> { let mut layout = Layout::new::<()>(); let mut field_offsets = Vec::new(); @@ -217,7 +217,7 @@ fn get_layout_for_union(field_layouts: &[Layout]) -> Result /// Takes in the layout of each variant (and their fields) (in declaration order) /// and returns the offsets of all fields of the enum, and the layout of the entire enum -/// NOTE: all fields of the enum discriminant is always at offset 0 +/// NOTE: the enum discriminant is always at 
offset 0 fn get_layout_for_enum( // the discriminants may be negative for some enums // or u128::MAX for some enums, so there is no one primitive integer type which works. So BigInteger From 0c0e429cd3c2fc244804a4bc4d8e0d2c5c178f4a Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 16:31:12 -0700 Subject: [PATCH 10/28] discriminant -> tag --- text/3845-repr-ordered-fields.md | 40 ++++++++++++++++++-------------- 1 file changed, 23 insertions(+), 17 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 7920c6c6e0b..dac652cfd5f 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -34,7 +34,7 @@ Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for c # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout. +`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in-memory layout. `repr(C)` in edition <= 2024 is an alias for `repr(ordered_fields)` and in all other editions, it matches the default C compiler for the given target for structs, unions, and field-less enums. Enums with fields will be laid out as if they are a union of structs with the corresponding fields. @@ -109,11 +109,11 @@ union FooUnion { `FooUnion` has the same layout as `u32`, since `u32` has both the biggest size and alignment. ### enum -The enum discriminant is the smallest signed integer type which can hold all of the discriminant values (unless otherwise specified). The discriminants are assigned such that each variant without an explicit discriminant is exactly one more than the previous variant in declaration order. 
+The enum's tag is the smallest signed integer type which can hold all of the discriminant values (unless otherwise specified). The discriminants are assigned such that each variant without an explicit discriminant is exactly one more than the previous variant in declaration order. If an enum doesn't have any fields, then it is represented exactly by it's discriminant. ```rust -// discriminant = i16 +// tag = i16 // represented as i16 #[repr(ordered_fields)] enum FooEnum { @@ -123,7 +123,7 @@ enum FooEnum { VarD, // discriminant = 501 } -// discriminant = u16 +// tag = u16 // represented as u16 #[repr(ordered_fields, u16)] enum FooEnumUnsigned { @@ -158,7 +158,7 @@ union BarUnion { } #[repr(ordered_fields)] -enum BarDiscriminant { +enum BarTag { VarFieldless, VarTuple, VarStruct, @@ -166,14 +166,18 @@ enum BarDiscriminant { #[repr(ordered_fields)] struct VarFieldless { - disc: BarDiscriminant, + tag: BarTag, } #[repr(ordered_fields)] -struct VarTuple(BarDiscriminant, u8, u32); +struct VarTuple(BarTag, u8, u32); #[repr(ordered_fields)] -struct VarStruct(BarDiscriminant, u16, u32); +struct VarStruct { + tag: BarTag, + a: u16, + b: u32 +} ``` In Rust, the algorithm for calculating the layout is defined precisely as follows: @@ -217,7 +221,7 @@ fn get_layout_for_union(field_layouts: &[Layout]) -> Result /// Takes in the layout of each variant (and their fields) (in declaration order) /// and returns the offsets of all fields of the enum, and the layout of the entire enum -/// NOTE: the enum discriminant is always at offset 0 +/// NOTE: the enum tag is always at offset 0 fn get_layout_for_enum( // the discriminants may be negative for some enums // or u128::MAX for some enums, so there is no one primitive integer type which works. 
So BigInteger @@ -229,18 +233,18 @@ fn get_layout_for_enum( let mut layout = Layout::new::<()>(); let mut variant_field_offsets = Vec::new(); - let mut variant_with_disc = Vec::new(); + let mut variant_with_tag = Vec::new(); // gives the smallest integer type which can represent the variants and the specified discriminants - let disc_layout = get_layout_for_discriminant(discriminants); - // ensure that the discriminant is the first field - variant_with_disc.push(disc_layout); + let tag_layout = get_layout_for_tag(discriminants); + // ensure that the tag is the first field + variant_with_tag.push(tag_layout); for &variant in variant_layouts { - variant_with_disc.truncate(1); + variant_with_tag.truncate(1); // put all other fields of the variant after this one - variant_with_disc.extend_from_slice(variant); - let (mut offsets, variant_layout) = get_layout_for_struct(&variant_with_disc)?; - // remove the discriminant so the caller only gets the fields they provided in order + variant_with_tag.extend_from_slice(variant); + let (mut offsets, variant_layout) = get_layout_for_struct(&variant_with_tag)?; + // remove the tag so the caller only gets the fields they provided in order offsets.remove(0); variant_field_offsets.push(offsets); @@ -309,6 +313,8 @@ See Rationale and Alternatives as well * This way we can evolve the repr (for example, by adding new niches) * Should we change the meaning of `repr(C)` in editions <= 2024 after we have reached edition 2033? Yes, it's a breaking change, but at that point it will likely only be breaking code no one uses. * Leaning towards no +* Should we warn on `repr(ordered_fields)` when explicit tag type is specified (i.e. no `repr(u8)`/`repr(i32)`) + * Since it's likely they didn't want the same tag type as `C`, and wanted the smallest possible tag type. 
# Future possibilities [future-possibilities]: #future-possibilities From 9e316ff9367589fece8099ce3e97926da4a519d5 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 16:33:24 -0700 Subject: [PATCH 11/28] Qualify alignment assumptions --- text/3845-repr-ordered-fields.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index dac652cfd5f..c9d7c3b6764 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -78,6 +78,7 @@ Structs are laid out in memory in declaration order, with padding bytes added as And their alignment is the same as their most aligned field. ```rust +// assuming that u32 is aligned to 4 bytes // size 16, align 4 #[repr(ordered_fields)] struct FooStruct { @@ -97,6 +98,7 @@ a...bbbbcc..dddd Unions would be laid out with the same size as their largest field, and the same alignment as their most aligned field. ```rust +// assuming that u32 is aligned to 4 bytes // size 4, align 4 #[repr(ordered_fields)] union FooUnion { From a1700b66bb549365fb414378ec40507ea71ff9b1 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Tue, 5 Aug 2025 18:58:03 -0700 Subject: [PATCH 12/28] oops, fix typo --- text/3845-repr-ordered-fields.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index c9d7c3b6764..b1a680341c6 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -315,7 +315,7 @@ See Rationale and Alternatives as well * This way we can evolve the repr (for example, by adding new niches) * Should we change the meaning of `repr(C)` in editions <= 2024 after we have reached edition 2033? Yes, it's a breaking change, but at that point it will likely only be breaking code no one uses. * Leaning towards no -* Should we warn on `repr(ordered_fields)` when explicit tag type is specified (i.e. 
no `repr(u8)`/`repr(i32)`) +* Should we warn on `repr(ordered_fields)` when explicit tag type is missing (i.e. no `repr(u8)`/`repr(i32)`) * Since it's likely they didn't want the same tag type as `C`, and wanted the smallest possible tag type. # Future possibilities [future-possibilities]: #future-possibilities From d75a497c82a6d520d3c2328cb42bda977d559b60 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Wed, 6 Aug 2025 15:16:17 -0700 Subject: [PATCH 13/28] add `repr(declaration_order)` as a potential spelling --- text/3845-repr-ordered-fields.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index b1a680341c6..0750369eab1 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -306,6 +306,7 @@ See Rationale and Alternatives as well * `repr(ordered)` * `repr(sequential)` * `repr(consistent)` + * `repr(declaration_order)` * something else? * Is the ABI of `repr(ordered_fields)` specified (making it safe for FFI)? Or not? * Should unions expose some niches? From 6526f5ac7d81bb66fc197ce1e3bddc5285e81c8d Mon Sep 17 00:00:00 2001 From: RustyYato Date: Wed, 6 Aug 2025 15:28:14 -0700 Subject: [PATCH 14/28] Switch enum tag type to defer to `repr(C)` tag type --- text/3845-repr-ordered-fields.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 0750369eab1..b00ba100d1b 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -111,13 +111,14 @@ union FooUnion { `FooUnion` has the same layout as `u32`, since `u32` has both the biggest size and alignment. ### enum -The enum's tag is the smallest signed integer type which can hold all of the discriminant values (unless otherwise specified). The discriminants are assigned such that each variant without an explicit discriminant is exactly one more than the previous variant in declaration order. 
+The enum's tag type is the same type that is used for `repr(C)` in edition <= 2024, and the discriminants are assigned the same way as `repr(C)` (in edition <= 2024). This means the discriminants are assigned such that each variant without an explicit discriminant is exactly one more than the previous variant in declaration order. +This does mean that the tag type will be platform specific. To alleviate this concern, using `repr(ordered_fields)` on an enum without an explicit `repr(uN)`/`repr(iN)` will trigger a warning. This warning should suggest the smallest integer type which can hold the discriminant values (preferring signed integers to break ties). If an enum doesn't have any fields, then it is represented exactly by it's discriminant. ```rust // tag = i16 // represented as i16 -#[repr(ordered_fields)] +#[repr(ordered_fields, i16)] enum FooEnum { VarA = 1, VarB, // discriminant = 2 @@ -140,7 +141,7 @@ Enums with fields will be laid out as if they were a union of structs. For example, this would be laid out the same as the union below ```rust -#[repr(ordered_fields)] +#[repr(ordered_fields, i8)] enum BarEnum { VarFieldless, VarTuple(u8, u32), @@ -159,7 +160,7 @@ union BarUnion { var3: VarStruct, } -#[repr(ordered_fields)] +#[repr(ordered_fields, i8)] enum BarTag { VarFieldless, VarTuple, From ed96c1b4cb44650f971abd5127bcfd277c02721c Mon Sep 17 00:00:00 2001 From: RustyYato Date: Thu, 7 Aug 2025 08:16:12 -0700 Subject: [PATCH 15/28] Rework edition_2024_repr_c warning from future compat to edition warning --- text/3845-repr-ordered-fields.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index b00ba100d1b..c004d0e2ca0 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -38,7 +38,7 @@ Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for c `repr(C)` in edition <= 2024 is an alias for
`repr(ordered_fields)` and in all other editions, it matches the default C compiler for the given target for structs, unions, and field-less enums. Enums with fields will be laid out as if they are a union of structs with the corresponding fields. -Using `repr(C)` in editions <= 2024 triggers a lint to use `repr(ordered_fields)` as a future compatibility lint with a machine-applicable fix. If you are using `repr(C)` for FFI, then you may silence this lint. If you are using `repr(C)` for anything else, please switch over to `repr(ordered_fields)` so updating to future editions doesn't change the meaning of your code. +Using `repr(C)` in editions <= 2024 triggers a lint to use `repr(ordered_fields)` as an optional edition migration lint with a machine-applicable fix. If you are using `repr(C)` for FFI, then you may silence this lint. If you are using `repr(C)` for anything else, please switch over to `repr(ordered_fields)` so updating to future editions doesn't change the meaning of your code. ``` warning: use of `repr(C)` in type `Foo` @@ -112,7 +112,7 @@ union FooUnion { `FooUnion` has the same layout as `u32`, since `u32` has both the biggest size and alignment. ### enum The enum's tag type is the same type that is used for `repr(C)` in edition <= 2024, and the discriminants are assigned the same way as `repr(C)` (in edition <= 2024). This means the discriminants are assigned such that each variant without an explicit discriminant is exactly one more than the previous variant in declaration order. -This does mean that the tag type will be platform specific. To alleviate this concern, using `repr(ordered_fields)` on an enum without an explicit `repr(uN)`/`repr(iN)` will trigger a warning. This warning should suggest the smallest integer type which can hold the discriminant values (preferring signed integers to break ties). +This does mean that the tag type will be platform specific. 
To alleviate this concern, using `repr(ordered_fields)` on an enum without an explicit `repr(uN)`/`repr(iN)` will trigger a warning (name TBD). This warning should suggest the smallest integer type which can hold the discriminant values (preferring signed integers to break ties). If an enum doesn't have any fields, then it is represented exactly by it's discriminant. ```rust @@ -261,23 +261,23 @@ fn get_layout_for_enum( The migration will be handled as follows: * after `repr(ordered_fields)` is implemented - * add a future compatibility warning for `repr(C)` in all current editions + * add an optional edition migration warning for `repr(C)` + * this warning should be advertised publicly (maybe on the Rust Blog?), so that as many people use it. Since even if you are staying on edition <= 2024, it is helpful to switch to `repr(ordered_fields)` to make your intentions clearer * at this point both `repr(ordered_fields)` and `repr(C)` will have identical behavior * the warning will come with a machine-applicable fix - * Any crate which does no FFI can just apply the autofix + * Any crate that does not have FFI can just apply the autofix * Any crate which uses `repr(C)` for FFI can ignore the warning crate-wide - * Any crate which mixes both must do extra work to figure out which is which. (This is likely a tiny minority of crate) -* Once the next edition rolls around (2027?), `repr(C)` on the new edition will *not* warn. Instead the meaning will have changed to mean *only* compatibility with C. The docs should be adjusted to mention this edition wrinkle. + * Any crate that mixes both must do extra work to figure out which is which. (This is likely a tiny minority of crates) +* Once the next edition rolls around (2027?), `repr(C)` on the new edition will *not* warn. Instead, the meaning will have changed to mean *only* compatibility with C. The docs should be adjusted to mention this edition wrinkle. 
* The warning for previous editions will continue to be in effect -* In some future edition (2033+), when it is deemed safe enough, the future compatibility warnings may be removed in editions <= 2024 - * This should have given plenty of time for anyone who was going to update their code to do so. And doesn't burden the language indefinitely - * This part isn't strictly necessary, and can be removed if needed # Drawbacks [drawbacks]: #drawbacks * This will cause a large amount of churn in the Rust ecosystem + * This is only necessary for those who are updating to the new edition, which is as little churn as we can make it * If we don't end up switching `repr(C)` to mean the system layout/ABI, then we will have two identical reprs, which may cause confusion. + # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives From 01cfb408ef8f05b61831c1e51ae176e47f02e5a9 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Thu, 7 Aug 2025 08:19:46 -0700 Subject: [PATCH 16/28] Add lints to summary section --- text/3845-repr-ordered-fields.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index c004d0e2ca0..3db21c9dfe6 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -6,7 +6,11 @@ # Summary [summary]: #summary -Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bikeshedded if this RFC is accepted) that can be applied to `struct`, `enum`, and `union` types, which guarantees a simple and predictable layout. Then provide an initial migration plan to switch users from `repr(C)` to `repr(ordered_fields)`. +Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bikeshedded if this RFC is accepted) that can be applied to `struct`, `enum`, and `union` types, which guarantees a simple and predictable layout. 
Then provide an initial migration plan to switch users from `repr(C)` to `repr(ordered_fields)`. This allows restricting the meaning of `repr(C)` to just serve the FFI use-case. + +Introduce two new warnings +1. An optional edition warning, when updating to the next edition, that the meaning of `repr(C)` is changing. +2. A warning-by-default lint when `repr(ordered_fields)` is used on enums without the tag type specified. Since this is likely not what the user wanted. # Motivation [motivation]: #motivation From f11567c3bfb74d1ea8d71ca4bf4f43e82c9e2407 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Thu, 7 Aug 2025 08:20:59 -0700 Subject: [PATCH 17/28] edition migraiton lint -> edition compatibility lint --- text/3845-repr-ordered-fields.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 3db21c9dfe6..a7bed2723d1 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -42,7 +42,7 @@ Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for c `repr(C)` in edition <= 2024 is an alias for `repr(ordered_fields)` and in all other editions, it matches the default C compiler for the given target for structs, unions, and field-less enums. Enums with fields will be laid out as if they are a union of structs with the corresponding fields. -Using `repr(C)` in editions <= 2024 triggers a lint to use `repr(ordered_fields)` as an optional edition migration lint with a machine-applicable fix. If you are using `repr(C)` for FFI, then you may silence this lint. If you are using `repr(C)` for anything else, please switch over to `repr(ordered_fields)` so updating to future editions doesn't change the meaning of your code. +Using `repr(C)` in editions <= 2024 triggers a lint to use `repr(ordered_fields)` as an optional edition compatibility lint with a machine-applicable fix. 
If you are using `repr(C)` for FFI, then you may silence this lint. If you are using `repr(C)` for anything else, please switch over to `repr(ordered_fields)` so updating to future editions doesn't change the meaning of your code. ``` warning: use of `repr(C)` in type `Foo` @@ -265,7 +265,7 @@ fn get_layout_for_enum( The migration will be handled as follows: * after `repr(ordered_fields)` is implemented - * add an optional edition migration warning for `repr(C)` + * add an optional edition compatibility lint for `repr(C)` * this warning should be advertised publicly (maybe on the Rust Blog?), so that as many people use it. Since even if you are staying on edition <= 2024, it is helpful to switch to `repr(ordered_fields)` to make your intentions clearer * at this point both `repr(ordered_fields)` and `repr(C)` will have identical behavior * the warning will come with a machine-applicable fix From b17f19288be573d961518187c4bcfeecea163f59 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Thu, 7 Aug 2025 10:28:15 -0700 Subject: [PATCH 18/28] fix code example of layout algorithm for enums --- text/3845-repr-ordered-fields.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index a7bed2723d1..c8f62a1b6cf 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -241,7 +241,7 @@ fn get_layout_for_enum( let mut variant_field_offsets = Vec::new(); let mut variant_with_tag = Vec::new(); - // gives the smallest integer type which can represent the variants and the specified discriminants + // gives the tag used by `repr(C)` enums let tag_layout = get_layout_for_tag(discriminants); // ensure that the tag is the first field variant_with_tag.push(tag_layout); From 488068ef147bce4f5c21d44c626e8666d05ec783 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 11 Aug 2025 08:26:38 -0700 Subject: [PATCH 19/28] Add reference to MSVC bug in motivation --- 
text/3845-repr-ordered-fields.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index c8f62a1b6cf..4712f0172cc 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -10,7 +10,7 @@ Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bi Introduce two new warnings 1. An optional edition warning, when updating to the next edition, that the meaning of `repr(C)` is changing. -2. A warning-by-default lint when `repr(ordered_fields)` is used on enums without the tag type specified. Since this is likely not what the user wanted. +2. A warn-by-default lint when `repr(ordered_fields)` is used on enums without the tag type specified. Since this is likely not what the user wanted. # Motivation [motivation]: #motivation @@ -35,6 +35,9 @@ struct SomeFFI([i64; 0]); Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for case 1. They want it to be size 0 (as it currently is). +A tertiary motivation is to make progress on a work around for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480). This proposal doesn't attempt a complete solution for the bug, but it will be a necessary component of any solution to the bug. + +The issue here is that MSVC is inconsistent about the alignment of `u64`/`i64` (and possibly `f64`). In MSVC, the `alignof` macro reports an alignment 4 bytes, but in structs it is aligned to 8 bytes. And on these platforms, we report the alignment as 8 bytes. Any proper work around will require reducing the alignment of `u64`/`i64` to 4 bytes, and adjusting what `repr(C)` to treat `u64`/`i64`'s alignment as 8 bytes. This way if you have references/pointers to `u64`/`i64` (for example, as out pointers), then the Rust side will not break when the C side passes a 4-byte aligned pointer (but not 8-byte aligned). 
This could happen if the C side put the integer on the stack, or was manually allocated at some 4-byte alignment. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From 93f53a5e54278d59d39fc996ae89286247419d62 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 11 Aug 2025 08:34:43 -0700 Subject: [PATCH 20/28] Add the suspicious_repr_c lint --- text/3845-repr-ordered-fields.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 4712f0172cc..44a66badc1f 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -11,6 +11,7 @@ Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bi Introduce two new warnings 1. An optional edition warning, when updating to the next edition, that the meaning of `repr(C)` is changing. 2. A warn-by-default lint when `repr(ordered_fields)` is used on enums without the tag type specified. Since this is likely not what the user wanted. +3. A warn-by-default lint when `repr(C)` is used, and there are no `extern` blocks or functions in the crate (on all editions). # Motivation [motivation]: #motivation @@ -59,6 +60,22 @@ warning: use of `repr(C)` in type `Foo` = note: `repr(C)` is planned to change meaning in the next edition to match the target platform's layout algorithm. This may change the layout of this type on certain platforms. To keep the current layout, switch to `repr(ordered_fields)` ``` +Using `repr(C)` on all editions (including > 2024) when there are no extern blocks or functions in the crate will trigger a warn-by-default lint suggesting to use `repr(ordered_fields)`. Since the most likely reason to do this is if you haven't heard of `repr(ordered_fields)` or are upgrading to the most recent Rust version (which now contains `repr(ordered_fields)`). 
+ +If *any* extern block or function (including `extern "Rust"`) is used in the crate, then this lint will not be triggered. This way we don't have too many false positives for this lint. However, the lint should *not* suggest adding a `extern` block or function, since the problem is likely the `repr`. + +``` +warning: use of `repr(C)` in type `Foo` + --> src/main.rs:14:10 + | +14 | #[repr(C)] + | ^^^^^^^ help: consider switching to `repr(ordered_fields)` + | struct Foo { + | + = note: `#[warn(suspicious_repr_c)]` on by default + = note: `repr(C)` is intended for FFI, and since there are no `extern` blocks or functions, it's likely that you meant to use `repr(ordered_fields)` to get a stable and consistent layout for your type +``` + After enough time has passed, and the community has switched over: This makes it easier to tell *why* the `repr` was applied to a given struct. If `repr(C)`, it's about FFI and interop. If `repr(ordered_fields)`, then it's for a dependable layout. # Reference-level explanation @@ -326,6 +343,7 @@ See Rationale and Alternatives as well * Leaning towards no * Should we warn on `repr(ordered_fields)` when explicit tag type is missing (i.e. no `repr(u8)`/`repr(i32)`) * Since it's likely they didn't want the same tag type as `C`, and wanted the smallest possible tag type. +* What should the lints look like? 
(can be decided after stabilization if needed, but preferably this is hammered out before stabilization and after this RFC is accepted) # Future possibilities [future-possibilities]: #future-possibilities From a2737d52bad5469ebb02bfea6d7d506d187e2ea5 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 11 Aug 2025 08:36:42 -0700 Subject: [PATCH 21/28] specify precedence of `suspicious_repr_c` lint --- text/3845-repr-ordered-fields.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 44a66badc1f..d62b28418db 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -64,6 +64,8 @@ Using `repr(C)` on all editions (including > 2024) when there are no extern bloc If *any* extern block or function (including `extern "Rust"`) is used in the crate, then this lint will not be triggered. This way we don't have too many false positives for this lint. However, the lint should *not* suggest adding a `extern` block or function, since the problem is likely the `repr`. +The `suspicious_repr_c` lint takes precedence over `edition_2024_repr_c`. + ``` warning: use of `repr(C)` in type `Foo` --> src/main.rs:14:10 From bb8a392bf960d4486e3ba10de7bc2c5fa62df885 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 11 Aug 2025 08:41:32 -0700 Subject: [PATCH 22/28] minor grammer/punctuation fixes --- text/3845-repr-ordered-fields.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index d62b28418db..a51e85b545e 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -36,9 +36,9 @@ struct SomeFFI([i64; 0]); Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for case 1. They want it to be size 0 (as it currently is). 
-A tertiary motivation is to make progress on a work around for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480). This proposal doesn't attempt a complete solution for the bug, but it will be a necessary component of any solution to the bug. +A tertiary motivation is to make progress on a workaround for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480). This proposal doesn't attempt a complete solution for the bug, but it will be a necessary component of any solution to the bug. -The issue here is that MSVC is inconsistent about the alignment of `u64`/`i64` (and possibly `f64`). In MSVC, the `alignof` macro reports an alignment 4 bytes, but in structs it is aligned to 8 bytes. And on these platforms, we report the alignment as 8 bytes. Any proper work around will require reducing the alignment of `u64`/`i64` to 4 bytes, and adjusting what `repr(C)` to treat `u64`/`i64`'s alignment as 8 bytes. This way if you have references/pointers to `u64`/`i64` (for example, as out pointers), then the Rust side will not break when the C side passes a 4-byte aligned pointer (but not 8-byte aligned). This could happen if the C side put the integer on the stack, or was manually allocated at some 4-byte alignment. +The issue here is that MSVC is inconsistent about the alignment of `u64`/`i64` (and possibly `f64`). In MSVC, the `alignof` macro reports an alignment of 4 bytes, but in structs, it is aligned to 8 bytes. And on these platforms, we report the alignment as 8 bytes. Any proper work around will require reducing the alignment of `u64`/`i64` to 4 bytes, and adjusting what `repr(C)` to treat `u64`/`i64`'s alignment as 8 bytes. This way, if you have references/pointers to `u64`/`i64` (for example, as out pointers), then the Rust side will not break when the C side passes a 4-byte aligned pointer (but not 8-byte aligned). 
This could happen if the C side put the integer on the stack, or was manually allocated at some 4-byte alignment. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From 612b99e4dbee0b2b10a986f56a2467b02dc7d7eb Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 11 Aug 2025 09:33:08 -0700 Subject: [PATCH 23/28] Fix MSVC bug description --- text/3845-repr-ordered-fields.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index a51e85b545e..c98ee2caf30 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -38,7 +38,9 @@ Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for c A tertiary motivation is to make progress on a workaround for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480). This proposal doesn't attempt a complete solution for the bug, but it will be a necessary component of any solution to the bug. -The issue here is that MSVC is inconsistent about the alignment of `u64`/`i64` (and possibly `f64`). In MSVC, the `alignof` macro reports an alignment of 4 bytes, but in structs, it is aligned to 8 bytes. And on these platforms, we report the alignment as 8 bytes. Any proper work around will require reducing the alignment of `u64`/`i64` to 4 bytes, and adjusting what `repr(C)` to treat `u64`/`i64`'s alignment as 8 bytes. This way, if you have references/pointers to `u64`/`i64` (for example, as out pointers), then the Rust side will not break when the C side passes a 4-byte aligned pointer (but not 8-byte aligned). This could happen if the C side put the integer on the stack, or was manually allocated at some 4-byte alignment. +The issue here is that MSVC is inconsistent about the alignment of `u64`/`i64` (and possibly `f64`). In MSVC, the alignment of `u64`/`i64` is reported to be 8 bytes by `alignof` and is correctly aligned in structs. 
However, when placed on the stack, MSVC doesn't ensure that they are aligned to 8-bytes, and may instead only align them to 4 bytes. + +Any proper work around will require reducing the alignment of `u64`/`i64` to 4 bytes, and adjusting what `repr(C)` to treat `u64`/`i64`'s alignment as 8 bytes. This way, if you have references/pointers to `u64`/`i64` (for example, as out pointers), then the Rust side will not break when the C side passes a 4-byte aligned pointer (but not 8-byte aligned). This could happen if the C side put the integer on the stack, or was manually allocated at some 4-byte alignment. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From 283df469ec9128e40f311a2cecc9eab2453afffb Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 18 Aug 2025 10:19:58 -0700 Subject: [PATCH 24/28] Update text/3845-repr-ordered-fields.md Co-authored-by: +merlan #flirora --- text/3845-repr-ordered-fields.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index c98ee2caf30..02ae90d917b 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -103,7 +103,7 @@ The exact algorithm is deferred to whatever the default target C compiler does w > - edited version of the [reference](https://doc.rust-lang.org/stable/reference/type-layout.html#the-c-representation) on `repr(C)` ### struct Structs are laid out in memory in declaration order, with padding bytes added as necessary to preserve alignment. -And their alignment is the same as their most aligned field. +The alignment of a struct is the same as the alignment of the most aligned field. 
```rust // assuming that u32 is aligned to 4 bytes From 8fe15764915700ac2fc46b72c7ccde9bf9f6e034 Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 18 Aug 2025 10:54:12 -0700 Subject: [PATCH 25/28] Update from reviews * update repr(ordered_fields) algorithm for `enum` * add to and update motivation section * add potential drawbacks to some lints --- text/3845-repr-ordered-fields.md | 73 +++++++++++++++++--------------- 1 file changed, 38 insertions(+), 35 deletions(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index 02ae90d917b..feaca39cc6e 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -9,9 +9,9 @@ Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bikeshedded if this RFC is accepted) that can be applied to `struct`, `enum`, and `union` types, which guarantees a simple and predictable layout. Then provide an initial migration plan to switch users from `repr(C)` to `repr(ordered_fields)`. This allows restricting the meaning of `repr(C)` to just serve the FFI use-case. Introduce two new warnings -1. An optional edition warning, when updating to the next edition, that the meaning of `repr(C)` is changing. -2. A warn-by-default lint when `repr(ordered_fields)` is used on enums without the tag type specified. Since this is likely not what the user wanted. -3. A warn-by-default lint when `repr(C)` is used, and there are no `extern` blocks or functions in the crate (on all editions). +1. An edition migration warning, when updating to the next edition, that the meaning of `repr(C)` is changing +2. A warn-by-default lint when `repr(ordered_fields)` is used on enums without the tag type specified. Since this is likely not what the user wanted +3. 
A warn-by-default lint when `repr(C)` is used, and there are no `extern` blocks or functions in the crate (on all editions) # Motivation [motivation]: #motivation @@ -36,7 +36,13 @@ struct SomeFFI([i64; 0]); Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for case 1. They want it to be size 0 (as it currently is). -A tertiary motivation is to make progress on a workaround for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480). This proposal doesn't attempt a complete solution for the bug, but it will be a necessary component of any solution to the bug. +The next two cases will not be solved by this RFC, but this RFC will provide the necessary parts steps towards the respective fixes. + +This also plays a role in [#3718](https://github.com/rust-lang/rfcs/pull/3718), where `repr(C, packed(N))` wants allow fields which are `align(M)` (while making the `repr(C, ...)` struct less packed). This is a footgun for normal uses of `repr(packed)`, so it would be better to relegate this strictly to the FFI use-case. However, since `repr(C)` plays two roles, this is difficult. + +By splitting `repr(ordered_fields)` off of `repr(C)`, we can allow `repr(C, packed(N))` to contain over-aligned fields (while making the struct less packed), and (continuing to) disallow `repr(ordered_fields, packed(N))` from containing aligned fields. Thus keeping the Rust-only case free of warts, without compromising on FFI use-cases. + +The final motivation is to make progress on a workaround for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480). The issue here is that MSVC is inconsistent about the alignment of `u64`/`i64` (and possibly `f64`). In MSVC, the alignment of `u64`/`i64` is reported to be 8 bytes by `alignof` and is correctly aligned in structs. However, when placed on the stack, MSVC doesn't ensure that they are aligned to 8-bytes, and may instead only align them to 4 bytes. 
@@ -66,6 +72,8 @@ Using `repr(C)` on all editions (including > 2024) when there are no extern bloc If *any* extern block or function (including `extern "Rust"`) is used in the crate, then this lint will not be triggered. This way we don't have too many false positives for this lint. However, the lint should *not* suggest adding a `extern` block or function, since the problem is likely the `repr`. +This does miss one potential use-case, where a crate provides a suite of FFI-capable types, but does not actually provide any `extern` functions or blocks. This should be an extremely small minority of crates, and they can silence this warning crate-wide. + The `suspicious_repr_c` lint takes precedence over `edition_2024_repr_c`. ``` @@ -165,7 +173,7 @@ enum FooEnumUnsigned { } ``` -Enums with fields will be laid out as if they were a union of structs. +Enums with fields will be laid out as if they were a struct containing the tag and a union of structs containing the data. For example, this would be laid out the same as the union below ```rust @@ -182,7 +190,13 @@ enum BarEnum { ```rust #[repr(ordered_fields)] -union BarUnion { +struct BarEnumRepr { + tag: BarTag, + data: BarEnumData, +} + +#[repr(ordered_fields)] +union BarEnumData { var1: VarFieldless, var2: VarTuple, var3: VarStruct, @@ -196,16 +210,13 @@ enum BarTag { } #[repr(ordered_fields)] -struct VarFieldless { - tag: BarTag, -} +struct VarFieldless; #[repr(ordered_fields)] -struct VarTuple(BarTag, u8, u32); +struct VarTuple(u8, u32); #[repr(ordered_fields)] struct VarStruct { - tag: BarTag, a: u16, b: u32 } @@ -250,39 +261,31 @@ fn get_layout_for_union(field_layouts: &[Layout]) -> Result Ok(layout.pad_to_align()) } -/// Takes in the layout of each variant (and their fields) (in declaration order) -/// and returns the offsets of all fields of the enum, and the layout of the entire enum +/// Takes in the layout of each variant (and their fields) (in declaration order), and returns the layout of the entire enum 
+/// the offsets of all fields of the enum is left as an excersize for the readers /// NOTE: the enum tag is always at offset 0 fn get_layout_for_enum( // the discriminants may be negative for some enums // or u128::MAX for some enums, so there is no one primitive integer type which works. So BigInteger discriminants: &[BigInteger], variant_layouts: &[&[Layout]] -) -> Result<(Vec<Vec<usize>>, Layout), LayoutError> { +) -> Result<Layout, LayoutError> { assert_eq!(discriminants.len(), variant_layouts.len()); + + let variant_data_layout = variant_layouts.iter() + .try_fold( + Layout::new::<()>(), + |acc, variant_layout| Ok(layout_max(acc, get_layout_for_struct(variant_layout)?.1)?) + )?; - let mut layout = Layout::new::<()>(); - let mut variant_field_offsets = Vec::new(); - - let mut variant_with_tag = Vec::new(); - // gives the tag used by `repr(C)` enums let tag_layout = get_layout_for_tag(discriminants); - // ensure that the tag is the first field - variant_with_tag.push(tag_layout); - - for &variant in variant_layouts { - variant_with_tag.truncate(1); - // put all other fields of the variant after this one - variant_with_tag.extend_from_slice(variant); - let (mut offsets, variant_layout) = get_layout_for_struct(&variant_with_tag)?; - // remove the tag so the caller only gets the fields they provided in order - offsets.remove(0); - - variant_field_offsets.push(offsets); - layout = layout_max(layout, variant_layout)?; - } - - Ok((variant_field_offsets, layout)) + + let (_, layout) = get_layout_for_struct(&[ + tag_layout, + variant_data_layout + ])?; + + Ok(layout) } ``` ### Migration to `repr(ordered_fields)` From 489b31a7de57c984cc8be80f34995760ad919f6c Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 18 Aug 2025 11:25:19 -0700 Subject: [PATCH 26/28] fix typo --- text/3845-repr-ordered-fields.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index feaca39cc6e..b5e47154cb0 100644 --- 
a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -36,7 +36,7 @@ struct SomeFFI([i64; 0]); Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for case 1. They want it to be size 0 (as it currently is). -The next two cases will not be solved by this RFC, but this RFC will provide the necessary parts steps towards the respective fixes. +The next two cases will not be solved by this RFC, but this RFC will provide the necessary steps towards the respective fixes. This also plays a role in [#3718](https://github.com/rust-lang/rfcs/pull/3718), where `repr(C, packed(N))` wants allow fields which are `align(M)` (while making the `repr(C, ...)` struct less packed). This is a footgun for normal uses of `repr(packed)`, so it would be better to relegate this strictly to the FFI use-case. However, since `repr(C)` plays two roles, this is difficult. From a5fb9bc5204dda45016583d2f672e928ff8cb2cb Mon Sep 17 00:00:00 2001 From: RustyYato Date: Mon, 18 Aug 2025 16:04:45 -0700 Subject: [PATCH 27/28] Fix typo Co-authored-by: Jacob Lifshay --- text/3845-repr-ordered-fields.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md index b5e47154cb0..7c5a0777c45 100644 --- a/text/3845-repr-ordered-fields.md +++ b/text/3845-repr-ordered-fields.md @@ -262,7 +262,7 @@ fn get_layout_for_union(field_layouts: &[Layout]) -> Result } /// Takes in the layout of each variant (and their fields) (in declaration order), and returns the layout of the entire enum -/// the offsets of all fields of the enum is left as an excersize for the readers +/// the offsets of all fields of the enum is left as an exercise for the readers /// NOTE: the enum tag is always at offset 0 fn get_layout_for_enum( // the discriminants may be negative for some enums From 88f631e1288b77b8d9091fdd8cfb0f0ea754948f Mon Sep 17 00:00:00 2001 From: RustyYato Date: Wed, 20 Aug 2025 13:58:25 -0700 Subject: 
[PATCH 28/28] Add AIX to motivation

---
 text/3845-repr-ordered-fields.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/text/3845-repr-ordered-fields.md b/text/3845-repr-ordered-fields.md
index 7c5a0777c45..3cdc8f7523d 100644
--- a/text/3845-repr-ordered-fields.md
+++ b/text/3845-repr-ordered-fields.md
@@ -42,11 +42,20 @@ This also plays a role in [#3718](https://github.com/rust-lang/rfcs/pull/3718),
 
 By splitting `repr(ordered_fields)` off of `repr(C)`, we can allow `repr(C, packed(N))` to contain over-aligned fields (while making the struct less packed), and (continuing to) disallow `repr(ordered_fields, packed(N))` from containing aligned fields. Thus keeping the Rust-only case free of warts, without compromising on FFI use-cases.
 
-The final motivation is to make progress on a workaround for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480).
+Splitting `repr(C)` also allows making progress on a workaround for the MSVC bug [rust-lang/rust/112480](https://github.com/rust-lang/rust/issues/112480) and a similar AIX [issue](https://internals.rust-lang.org/t/repr-c-aix-struct-alignment/21594).
 
 The issue here is that MSVC is inconsistent about the alignment of `u64`/`i64` (and possibly `f64`). In MSVC, the alignment of `u64`/`i64` is reported to be 8 bytes by `alignof` and is correctly aligned in structs. However, when placed on the stack, MSVC doesn't ensure that they are aligned to 8-bytes, and may instead only align them to 4 bytes.
 
 Any proper work around will require reducing the alignment of `u64`/`i64` to 4 bytes, and adjusting what `repr(C)` to treat `u64`/`i64`'s alignment as 8 bytes. This way, if you have references/pointers to `u64`/`i64` (for example, as out pointers), then the Rust side will not break when the C side passes a 4-byte aligned pointer (but not 8-byte aligned). This could happen if the C side put the integer on the stack, or was manually allocated at some 4-byte alignment.
+
+For AIX, the issue is that `f64` is treated as aligned to 4 bytes if it is not the first field in a struct. For example:
+```C
+struct Foo {
+    char a;
+    double b;
+};
+```
+Field `b` would be laid out at offset 4, which is under-aligned (since `f64` has alignment 8 in Rust). Again, any proper workaround will require reducing the alignment of `f64`, and adjusting `repr(C)`.
 
 # Guide-level explanation
 [guide-level-explanation]: #guide-level-explanation
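
Editorial sketch (not part of the patch): the AIX offset arithmetic described in the hunk above can be illustrated with a small declaration-order layout model. The helper names `align_up` and `field_offsets` are hypothetical, and the `(size, align)` pairs are assumptions taken from the patch text (`f64` treated as 8-byte aligned on the Rust side, and as 4-byte aligned for a non-first `double` on AIX). This is a simplified model, not the RFC's layout algorithm.

```rust
/// Round `offset` up to the next multiple of `align`.
/// Assumes `align` is a power of two.
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Compute the offset of each field when fields are placed in
/// declaration order. Each field is given as `(size, align)`.
fn field_offsets(fields: &[(usize, usize)]) -> Vec<usize> {
    let mut offset = 0;
    let mut offsets = Vec::with_capacity(fields.len());
    for &(size, align) in fields {
        // Insert padding so the field starts at a multiple of its alignment.
        offset = align_up(offset, align);
        offsets.push(offset);
        offset += size;
    }
    offsets
}

fn main() {
    // `char a; double b;` with `double` treated as 8-byte aligned
    // (assumption: Rust's view of `f64`, per the patch text).
    let rust_view = field_offsets(&[(1, 1), (8, 8)]);
    // Same fields, with the non-first `double` treated as 4-byte aligned
    // (assumption: the AIX behaviour described in the patch text).
    let aix_view = field_offsets(&[(1, 1), (8, 4)]);

    println!("8-byte aligned f64: b at offset {}", rust_view[1]);
    println!("4-byte aligned f64: b at offset {}", aix_view[1]);
}
```

The two calls differ only in the alignment assumed for the second field, which is the quantity the patch text says a workaround would have to adjust.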