rockerBOO (Contributor) commented Apr 22, 2025

Note: I have not tried this myself yet. It is the same implementation as I have for flux.

Flash Attention may not work because we are using an attention mask.
cuDNN may have issues, but you can enable it via TORCH_CUDNN_SDPA_ENABLED=1.
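For reference, the opt-in looks like this in a shell session (the training script name is a placeholder, not from this PR):

```shell
# Opt in to the cuDNN SDPA backend; PyTorch reads this variable at startup.
export TORCH_CUDNN_SDPA_ENABLED=1
# Then launch training as usual, e.g. (train.py is a hypothetical script name):
# python train.py
echo "$TORCH_CUDNN_SDPA_ENABLED"
```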

FLASH_ATTENTION: The flash attention backend for scaled dot product attention.
CUDNN_ATTENTION: The cuDNN backend for scaled dot product attention.
EFFICIENT_ATTENTION: The efficient attention backend for scaled dot product attention.
MATH: The math backend for scaled dot product attention.

Related: #2045
