Commit 4b040a2

Address feedback and questions
1 parent b9e6849 commit 4b040a2

1 file changed: +80 -6 lines changed

text/3844-next-gen-transmute.md

Lines changed: 80 additions & 6 deletions
@@ -1,5 +1,5 @@
 - Feature Name: `next_gen_transmute`
-- Start Date: (fill me in with today's date, YYYY-MM-DD)
+- Start Date: 2025-08-01
 - RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
 - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

@@ -31,6 +31,12 @@ to match, and thus there's no opportunity for the compiler to help catch a mista
 expectation. Plus it obfuscates other locations that really do want `transmute_copy`,
 perhaps because they're intentionally reading a prefix out of something.

+It's also a safety footgun because it'll *compile* if you instead were to write
+```rust
+mem::transmute_copy(&other)
+```
+but is highly likely to result in use-after-free UB.
+
 It would be nice to move `mem::transmute` to being a normal function -- not the one
 intrinsic we let people call directly -- in a way that it can be more flexible for
 users as well as easier to update in the compiler without semver worries.
@@ -208,10 +214,15 @@ could be implemented and tested before doing the publicly-visible switchover.

 Well, there's two big reasons to prefer post-mono here:

-1. By being post-mono, it's 100% accurate. Sure, if we could easily be perfectly
-accurate earlier in the pipeline that would be nice, but since it's layout-based
-that's extremely difficult at best because of layering. (For anything related
-to coroutines that's particularly bad.)
+1. By being post-mono, it eliminates all "the compiler isn't smart enough" cases.
+If you get an error from it, then the two types are *definitely* of different
+sizes, *period*. If you find a way to encode Fermat's Last Theorem in the
+type system, it's ok, the compiler doesn't have to know how to prove it to let
+you do the transmute. It would be *nice* if we could be that accurate earlier
+in the compilation pipeline, but for anything layout-based that's extremely
+difficult -- especially for futures. There's still the potential for "false"
+warnings in code that's only conditionally run, but that's also true of trait
+checks, and is thus left for a different RFC to think about.
 2. By being *hard* errors, rather than lints, there's a bunch more breaking change
 concerns. Any added smarts that allow something to compile need to be around
 *forever* (as removing them would be breaking), and similarly the exact details
@@ -236,18 +247,81 @@ Plus using `union`s for type punning like this is something that people already
 do, so having a name for it helps make what's happening more obvious, plus gives
 a place for us to provide better documentation and linting when they do use it.

+## Why the name `union_transmute`?
+
+The idea is to lean on the fact that Rust already has `union` as a user-visible
+concept, since what this does is *exactly* the same as using an
+all-fields-at-the-same-offset `union` to reinterpret the representation.
+Similarly, a common way to do this operation in C is to use a union, so people
+coming from other languages will recognize it.
+
+Thinking about the `union` hopefully also gives people the right intuition about
+the requirements that this has, especially in comparison to what the requirements
+would be if this had the pointer-cast semantics. Hopefully seeing the union in
+the name helps them *not* think that it's just `(&raw const x).cast().read()`.
+
+There's currently (as an implementation detail) a `transmute_unchecked` intrinsic
+in rustc which doesn't have the typeck-time size check, but I leaned away from
+that name because it's unprecedented, to my knowledge, to have a `foo_unchecked`
+in the stable library where `foo` is *also* an `unsafe fn`.
+
+If we were in a world where `mem::transmute` was actually a *safe* function,
+then `transmute_unchecked` for this union-semantic version would make sense,
+but we don't currently have such a thing.
+
+## Could we keep the compile-time checks on old editions?
+
+This RFC is written assuming that we'll be able to remove the type-checking-time
+special behaviour entirely. That does mean that some things that used to fail
+will start to compile, and it's possible that people were writing code depending
+on that kind of trickery to enforce invariants.
+
+However, there's never been a guarantee about what exactly those checks enforce,
+and in general we're always allowed to make previously-non-compiling things start
+to compile in new versions -- as has happened before with the check getting
+smarter. We're likely fine saying that such approaches were never endorsed and
+thus that libraries should move to other mechanisms to check sizes, as
+[some ecosystem crates](https://github.com/Lokathor/bytemuck/pull/320) have
+already started to do.
+
+If for some reason that's not ok, we could consider approaches like
+edition-specific name resolution to have `mem::transmute` on edition ≤ 2024
+continue to get the typeck hacks for this, but on future editions resolve to
+the version using the interior const-assert instead.
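
Editor's sketch (not part of the commit): the "interior const-assert" mentioned above could look roughly like the following. The name `transmute_sketch` is hypothetical, and the body uses `mem::transmute_copy` purely for illustration; nothing here is claimed to match what the RFC or rustc actually does. The point is only that the size check lives inside the function body and is evaluated per concrete instantiation rather than at type-checking time.

```rust
use std::mem::{self, ManuallyDrop};

/// Hypothetical illustration only; not the RFC's or rustc's implementation.
///
/// # Safety
/// The bytes of `value` must form a valid value of type `U`.
pub unsafe fn transmute_sketch<T, U>(value: T) -> U {
    // Inline const block: evaluated for each concrete `(T, U)` pair that is
    // actually instantiated, so a size mismatch becomes a compile-time error
    // only after monomorphization.
    const {
        assert!(
            mem::size_of::<T>() == mem::size_of::<U>(),
            "cannot transmute between types of different sizes"
        )
    };
    // Wrap the source so its destructor never runs after the bytes are reused.
    let value = ManuallyDrop::new(value);
    // SAFETY: the sizes are equal (checked above); validity of the resulting
    // `U` is the caller's obligation.
    unsafe { mem::transmute_copy::<ManuallyDrop<T>, U>(&value) }
}
```

Under this sketch, an instantiation such as `transmute_sketch::<u32, u64>` would be rejected when that instantiation is compiled, while generic code that merely mentions `transmute_sketch::<T, U>` type-checks without any size reasoning.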
+
+## Is transmuting to something bigger ever *not* UB?
+
+As a simple case, if you have
+
+```rust
+#[repr(C, align(4))]
+struct AlignedByte(u8);
+```
+
+then `union_transmute::<u8, AlignedByte>` and `union_transmute::<AlignedByte, u8>`
+are in fact *both* always sound, despite the sizes never matching.
+
+You can easily make other similar examples using `repr(packed)` as well.
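
Editor's sketch (not part of the commit): the `AlignedByte` case can be tried today with a hand-written union. The `Pun` type and the function name `union_transmute_sketch` are hypothetical stand-ins for the proposed `union_transmute`, and the extra derives on `AlignedByte` are added here only so the result can be compared.

```rust
use std::mem::ManuallyDrop;

// Same layout as the example above; the derives are an addition for testing.
#[repr(C, align(4))]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct AlignedByte(pub u8);

/// Hypothetical helper: both fields start at offset 0, and the union is as
/// large and as aligned as the bigger of the two field types.
#[repr(C)]
union Pun<T, U> {
    src: ManuallyDrop<T>,
    dst: ManuallyDrop<U>,
}

/// Hypothetical stand-in for the proposed `union_transmute`.
///
/// # Safety
/// Reading the bytes written for `T` (plus any uninitialized tail) as a `U`
/// must produce a valid `U`.
pub unsafe fn union_transmute_sketch<T, U>(value: T) -> U {
    let pun = Pun::<T, U> { src: ManuallyDrop::new(value) };
    // SAFETY: validity of the `dst` view is the caller's obligation.
    ManuallyDrop::into_inner(unsafe { pun.dst })
}
```

With these definitions, `u8` to `AlignedByte` leaves only the three padding bytes uninitialized, and `AlignedByte` to `u8` reads just the first byte, so both directions stay within the union's storage. A pointer-cast read of an `AlignedByte` from a lone `u8` would instead read past the end of the source.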
+

 # Prior art
 [prior-art]: #prior-art

-Unknown.
+C++ has `reinterpret_cast` which sounds like it'd be similar, but which isn't
+defined for aggregates, just between integers and pointers or between pointers
+and other pointers.
+
+GCC has a cast-to-union extension, but it only goes from a value to a `union`
+with a field of matching type, and doesn't include the part of going from the
+`union` back to a different field.


 # Unresolved questions
 [unresolved-questions]: #unresolved-questions

 During implementation:
 - Should MIR's `CastKind::Transmute` retain its equal-size precondition?
+- What name should the new function get?

 For nightly and continuing after stabilization:
 - What exactly are the correct lints to have about these functions?
