fix mask with flash attn #2
Conversation
@Green-Sky I think something is wrong somewhere. I get black images, both on Vulkan and CPU (using updated ggml for both), when using masking.
With flash attention or without?
With flash attention. Without masking, fa works fine, and without fa, masking doesn't cause issues.
Oh, updated ggml, hmm. Time to look at all the changes?
On outdated (current) GGML, fa doesn't work at all on Vulkan, but CPU results are still black. Can you reproduce it (with low resolution, 1 step, and cfg_scale 1)?
How small? Because 64x64 just crashes.
I meant the standard 512x512.
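For reference, a repro invocation along these lines should trigger it (a sketch assuming the stable-diffusion.cpp CLI flags; the model path, prompt, and output name are placeholders, and `--diffusion-fa` is the flag that enables flash attention in the diffusion model):

```sh
# hypothetical repro: 512x512, 1 step, cfg_scale 1, flash attention enabled
./sd -m ./chroma.safetensors -p "test prompt" \
     -W 512 -H 512 --steps 1 --cfg-scale 1 \
     --diffusion-fa -o mask_fa_test.png
```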
Interesting, CPU mask + fa is pitch black for me.
I'll suppose it's an upstream GGML problem; let's merge it for CUDA users.
asan is clean too (besides the memory leak, which might be worth fixing, btw).
Looks like upstream does not test values in the mask for flash attention, only the fact that a mask exists (with nothing masked).
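A minimal sketch of what a test exercising real mask values could look like, assuming the `ggml_flash_attn_ext` signature and `GGML_KQ_MASK_PAD` from recent ggml.h (tensor shapes and the -INF pattern are illustrative; this is not the upstream test):

```c
// sketch: run ggml_flash_attn_ext with a mask that actually masks
// something, instead of the all-zero mask upstream reportedly tests with
#include <math.h>
#include "ggml.h"
#include "ggml-cpu.h" // ggml_graph_compute_with_ctx lives here in recent ggml

int main(void) {
    const int64_t head_dim = 64, n_head = 4, n_tokens = 32, n_kv = 32;

    struct ggml_init_params params = {
        /*.mem_size   =*/ 64*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    struct ggml_tensor * q = ggml_new_tensor_3d(ctx, GGML_TYPE_F32, head_dim, n_tokens, n_head);
    struct ggml_tensor * k = ggml_new_tensor_3d(ctx, GGML_TYPE_F16, head_dim, n_kv,     n_head);
    struct ggml_tensor * v = ggml_new_tensor_3d(ctx, GGML_TYPE_F16, head_dim, n_kv,     n_head);

    // the mask must be F16, shaped [n_kv, n_tokens padded to GGML_KQ_MASK_PAD]
    struct ggml_tensor * mask = ggml_new_tensor_2d(ctx, GGML_TYPE_F16,
        n_kv, GGML_PAD(n_tokens, GGML_KQ_MASK_PAD));

    // deterministic dummy data for q/k/v
    for (int64_t i = 0; i < ggml_nelements(q); ++i) {
        ((float *) q->data)[i] = 0.01f*(float)(i % 97);
    }
    for (int64_t i = 0; i < ggml_nelements(k); ++i) {
        ((ggml_fp16_t *) k->data)[i] = ggml_fp32_to_fp16(0.01f*(float)(i % 89));
        ((ggml_fp16_t *) v->data)[i] = ggml_fp32_to_fp16(0.01f*(float)(i % 83));
    }

    // mask out every other kv position: 0 = attend, -INF = masked
    ggml_fp16_t * m = (ggml_fp16_t *) mask->data;
    for (int64_t i = 0; i < ggml_nelements(mask); ++i) {
        m[i] = ggml_fp32_to_fp16((i % 2) ? -INFINITY : 0.0f);
    }

    struct ggml_tensor * out = ggml_flash_attn_ext(ctx, q, k, v, mask,
        1.0f/sqrtf((float)head_dim), /*max_bias =*/ 0.0f, /*logit_softcap =*/ 0.0f);

    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, out);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads =*/ 1);

    // a real test would compare out->data against the non-fa attention path
    // and fail on all-zero/NaN output instead of just running the graph
    ggml_free(ctx);
    return 0;
}
```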
The pad requirement was 2816 for the chroma/flux mask.
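For what it's worth, 2816 is a multiple of both 32 and 64, so it is consistent with ggml's mask padding macro either way; the token count below is a placeholder, since the actual unpadded chroma/flux count isn't given in this thread:

```c
// hypothetical numbers: GGML_PAD rounds up to a multiple of the pad unit,
// e.g. GGML_PAD(2801, 32) == 2816 and GGML_PAD(2801, 64) == 2816
#include <stdio.h>
#include "ggml.h"

int main(void) {
    const int64_t n_tokens = 2801; // placeholder unpadded mask row count
    printf("padded: %lld\n", (long long) GGML_PAD(n_tokens, GGML_KQ_MASK_PAD));
    return 0;
}
```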