[CIR][CIRGen][Builtin][X86] lower undef intrinsics #1775

Merged 1 commit on Aug 7, 2025

Conversation

RiverDave
Collaborator

We seem to be generating extra load/store instructions, as noted here, for all null values encoded in this patch:

CIR:

define dso_local <8 x bfloat> @test_mm_undefined_pbh() #0 {
  %1 = alloca <8 x bfloat>, i64 1, align 16
  %2 = alloca <8 x bfloat>, i64 1, align 16
  store <8 x bfloat> zeroinitializer, ptr %1, align 16
  %3 = load <8 x bfloat>, ptr %1, align 16
  store <8 x bfloat> %3, ptr %2, align 16
  %4 = load <8 x bfloat>, ptr %2, align 16
  ret <8 x bfloat> %4
}

whereas OG:

define dso_local <8 x bfloat> @test_mm_undefined_pbh() #0 {
entry:
  ret <8 x bfloat> zeroinitializer
}

Collaborator

@andykaylor left a comment

This looks reasonable to me.

The behavior you mention in the comments with the extra alloca instructions is an artifact of CIR always generating an alloca for the return value, combined with most optimizations being disabled. The x86 intrinsics are defined as functions with the always_inline attribute set, so unless you pass -disable-llvm-passes on the command line, the always-inline pass will inline the function. No other optimizations are run, however, so the allocas aren't cleaned up. You can see that with a simple case here:

https://godbolt.org/z/KcWdvP3r4

It's a benign behavior, generally speaking. At some point we may want to do something about the return allocas in a pre-lowering transform.
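
For readers without the godbolt link handy, here is a minimal sketch of the pattern being described. It is not the actual header definition of any intrinsic: `v4f`, `my_undefined_ps`, and `test_undefined` are hypothetical names chosen for illustration, and the vector type uses the GCC/Clang `vector_size` extension. The callee is marked always_inline and returns a zero vector, which is how the "undefined" intrinsics are lowered in the IR above.

```c
/* Hypothetical stand-in for an x86 "undefined" intrinsic wrapper.
 * Assumes a GCC- or Clang-compatible compiler (vector_size extension). */
typedef float v4f __attribute__((vector_size(16)));

/* always_inline: the always-inline pass folds this into its caller
 * even when the rest of the optimization pipeline is not run. */
static inline __attribute__((always_inline)) v4f my_undefined_ps(void) {
  /* Zero vector, matching the zeroinitializer seen in the IR above. */
  return (v4f){0.0f, 0.0f, 0.0f, 0.0f};
}

/* Caller comparable in shape to test_mm_undefined_pbh above. */
v4f test_undefined(void) {
  return my_undefined_ps();
}
```

If both the caller and the inlined callee get a return-value alloca, as described above, one would expect two store/load pairs to survive in the unoptimized output for `test_undefined`, while the result is still the same zero vector.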

@bcardosolopes merged commit 27315ab into llvm:main Aug 7, 2025
7 of 10 checks passed