
Enable fast attention in nanoPET #454

Open · frostedoyster wants to merge 8 commits into main from flash-attention-nanopet
Conversation

frostedoyster (Collaborator) commented Jan 26, 2025


📚 Documentation preview 📚: https://metatrain--454.org.readthedocs.build/en/454/

frostedoyster added the "Discussion" (Issues to be discussed by the contributors) label on Jan 26, 2025
spozdn (Collaborator) commented Jan 26, 2025

I think a proper backward-compatible implementation is on the way. We can discuss it soon.

Update: not "backward compatible" in the sense clarified later; I meant supporting the backward pass.

abmazitov (Contributor) commented

Does it give the same result as the original version of the code? If so, I would be quite surprised.

abmazitov (Contributor) commented

What do you mean by saying that it’s not backward compatible? I don’t see why the backward pass should not work.

Luthaf (Member) commented Jan 27, 2025

> What do you mean by saying that it’s not backward compatible? I don’t see why the backward pass should not work.

Backward compatible in the API sense, i.e. that the same checkpoint will produce the same results without retraining.
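
For reference, a minimal sketch (not part of this PR; shapes and tolerances are illustrative assumptions) of the equivalence check being discussed, comparing masked F.scaled_dot_product_attention against an explicit softmax(QK^T / sqrt(d)) V reference:

```python
# Sketch (not from nanoPET): check that masked scaled_dot_product_attention
# reproduces an explicit softmax(QK^T / sqrt(d)) V computation. Shapes are illustrative.
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q, k, v = (torch.randn(2, 4, 16, 8) for _ in range(3))
# Boolean mask, True = allowed to attend (mostly True so no row is fully masked)
attn_mask = torch.rand(2, 1, 16, 16) > 0.1

sdpa_out = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)

scores = (q @ k.transpose(-2, -1)) / math.sqrt(q.shape[-1])
scores = scores.masked_fill(~attn_mask, float("-inf"))
manual_out = torch.softmax(scores, dim=-1) @ v

print(torch.allclose(sdpa_out, manual_out, atol=1e-5))  # expected to print True
```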

frostedoyster (Collaborator, Author) commented Jan 28, 2025

@spozdn I think that using a custom attn_mask here

attn_output = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)

will prevent torch from using flash attention.

You'll find more details here, where they say that they only implement full (all-True) and causal masks: https://github.com/Dao-AILab/flash-attention

And here (https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html), where they say: "All implementations are enabled by default. Scaled dot product attention attempts to automatically select the most optimal implementation based on the inputs."

Anyway, this can be used to check which backend is being used:
https://pytorch.org/docs/stable/generated/torch.nn.attention.sdpa_kernel.html#torch.nn.attention.sdpa_kernel

EDIT: Flash attention doesn't support a custom mask, but cuDNN attention does, so we'll go with that.
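
For reference, a minimal sketch (not part of this PR; shapes, dtype and the backend list are illustrative assumptions) of how sdpa_kernel can be used to check which backends accept a custom attn_mask on a CUDA machine:

```python
# Sketch: restrict scaled_dot_product_attention to one backend at a time and see
# which ones accept a custom attn_mask. Assumes PyTorch >= 2.3 (sdpa_kernel) with
# CUDA; the CUDNN_ATTENTION backend needs a fairly recent build.
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

device, dtype = "cuda", torch.float16
q, k, v = (torch.randn(1, 8, 128, 64, device=device, dtype=dtype) for _ in range(3))
attn_mask = torch.ones(1, 1, 128, 128, dtype=torch.bool, device=device)  # toy mask

for backend in (SDPBackend.FLASH_ATTENTION, SDPBackend.CUDNN_ATTENTION,
                SDPBackend.EFFICIENT_ATTENTION, SDPBackend.MATH):
    try:
        with sdpa_kernel(backend):
            F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)
        print(f"{backend.name}: works with a custom attn_mask")
    except RuntimeError:
        print(f"{backend.name}: no kernel available for these inputs")
```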

frostedoyster force-pushed the flash-attention-nanopet branch from 4274feb to df0587e on January 31, 2025 08:10
frostedoyster changed the title from "Weird trick to enable flash attention in nanoPET" to "Enable fast attention in nanoPET" on Jan 31, 2025
frostedoyster force-pushed the flash-attention-nanopet branch from df0587e to f87835e on January 31, 2025 08:14
frostedoyster removed the "Discussion" (Issues to be discussed by the contributors) label on Jan 31, 2025
frostedoyster force-pushed the flash-attention-nanopet branch 6 times, most recently from 3372801 to 4541da7, on January 31, 2025 14:14
frostedoyster force-pushed the flash-attention-nanopet branch 2 times, most recently from f5df1b4 to 9f846db, on January 31, 2025 14:43
frostedoyster force-pushed the flash-attention-nanopet branch from 9f846db to 87704fe on January 31, 2025 14:45