Skip to content

BADHACK: gles/egl: Require surface before creating instance / query adapter on Wayland #7969

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: trunk
Choose a base branch
from

Conversation

MarijnS95
Copy link
Contributor

@MarijnS95 MarijnS95 commented Jul 19, 2025

Connections
Fixes #5505
Could address #3176 if we make Inner an Option
Fixes #7965

Description

In egl, Inner has been made Clone in #2350, and prior to that some EGL fields were copied into AdapterContext since #1729.

At the same time wgpu to this day doesn't seem to accept a RawDisplayHandle when creating an Instance, meaning that on Wayland at least we end up creating a bogus EGLDisplay (and respective GL context) that isn't compatible with the window/surface that the user later wants wgpu to present to (not to mention, the user/caller has zero control over the Wayland or other connection to use, this is all hardcoded based on library detection and connecting to the default Wayland socket...).

Each time wl_display_connect() is called, which is likely also what happens internally when EGL_DEFAULT_DISPLAY is passed for EGL_PLATFORM_WAYLAND_KHR, a different and incompatible display connection is used. Even if wgpu's test_wayland_display() would have connected to the same Wayland server as eglGetPlatformDisplay(WAYLAND, DEFAULT_DISPLAY) and the one the caller uses for its surface, these cannot be used interchangeably.

To patch up this bogus display, gfx-rs/gfx#3545 inserted some logic to replace the internal EGLDisplay and GL context whenever a surface is created which does receive a RawDisplayHandle with the correct wl_display handle to use to be compatible with the given wl_surface:

wgpu/wgpu-hal/src/gles/egl.rs

Lines 991 to 1034 in 96ef5ec

(Rwh::Wayland(_), raw_window_handle::RawDisplayHandle::Wayland(display_handle)) => {
if inner
.wl_display
.map(|ptr| ptr != display_handle.display.as_ptr())
.unwrap_or(true)
{
/* Wayland displays are not sharable between surfaces so if the
* surface we receive from this handle is from a different
* display, we must re-initialize the context.
*
* See gfx-rs/gfx#3545
*/
log::warn!("Re-initializing Gles context due to Wayland window");
use core::ops::DerefMut;
let display_attributes = [khronos_egl::ATTRIB_NONE];
let display = unsafe {
inner
.egl
.instance
.upcast::<khronos_egl::EGL1_5>()
.unwrap()
.get_platform_display(
EGL_PLATFORM_WAYLAND_KHR,
display_handle.display.as_ptr(),
&display_attributes,
)
}
.unwrap();
let new_inner = Inner::create(
self.flags,
Arc::clone(&inner.egl.instance),
display,
inner.force_gles_minor_version,
)?;
let old_inner = core::mem::replace(inner.deref_mut(), new_inner);
inner.wl_display = Some(display_handle.display.as_ptr());
drop(old_inner);
}
}

Now all the way back to the first sentence, about cloning a bunch of EGL handles. The problem mainly occurs when the user either enumerated Adapters before this operation, and worse if they started creating resources on it. Because of this clone:

wgpu/wgpu-hal/src/gles/egl.rs

Lines 1100 to 1103 in 96ef5ec

AdapterContext {
glow: Mutex::new(gl),
egl: Some(inner.egl.clone()),
},

Those adapters have the old EGLDisplay and old GL context. Which we just closed when replacing Inner with the "new wl_display-based" EGLDisplay and context:

impl Drop for Inner {
fn drop(&mut self) {
if let Err(e) = self
.egl
.instance
.destroy_context(self.egl.display, self.egl.raw)
{
log::warn!("Error in destroy_context: {e:?}");
}
if let Err(e) = terminate_display(&self.egl.instance, self.egl.display) {
log::warn!("Error in terminate: {e:?}");
}
}
}

Whenever the user now uses the adapter to make the context current, a BadDisplay / BadContext error is generated because the context and display handles are both destroyed.

To solve that, one of the proposed solutions was to follow what Web does: require a surface when enumerating Adapters so that the right EGLDisplay is copied over. This is however overly restrictive:

  • After creating a surface, the internals are updated and a surface no longer needs to be passed when enumerating surfaces.
  • A surface (typically from a window) shouldn't be needed at all, if merely the RawDisplayHandle could be passed.

For now, see if this unblocks folks while we work on a more likable solution. Stay tuned.

Testing
amdgpu with Wayland compositor, with e.g.;

WGPU_BACKEND=gl cargo r -p wgpu-examples -- shadow

Squash or Rebase?

rebase

Checklist

  • Run cargo fmt.
  • Run taplo format.
  • Run cargo clippy --tests. If applicable, add:
    • --target wasm32-unknown-unknown
  • Run cargo xtask test to run tests.
  • If this contains user-facing changes, add a CHANGELOG.md entry.

@MarijnS95 MarijnS95 force-pushed the egl-force-surface-before-adapter-enumerate-on-wayland branch from 3389cb5 to 12c42fd Compare July 22, 2025 15:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EGL BadDisplay when using Wayland
1 participant