Mojo module
flash_attention
Functions
- flash_attention
- flash_attention_kv_cache
- flash_attention_split_kv: Variant of flash attention that takes the previous KV cache input_{k,v}_cache_fn and the current KV tensors input_k_fn and input_v_fn as separate arguments.
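The idea behind a split-KV variant can be illustrated outside Mojo. The following is a conceptual sketch in plain Python, not the module's API: the function names `attend_concat` and `attend_split_kv` and the list-based arguments are hypothetical, and the running-max softmax used here is only one way a kernel could consume the cached and current K/V without first concatenating them. It shows that attending over the two sources separately gives the same result as attending over their concatenation.

```python
import math


def attend_concat(q, keys, values):
    """Reference: softmax(q . K^T / sqrt(d)) V over a single key/value list."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def attend_split_kv(q, k_cache, v_cache, k_new, v_new):
    """Hypothetical helper: attend over cached and current K/V passed separately.

    Assumption: a running-max (online) softmax is used so the two sources
    never need to be copied into one contiguous buffer.
    """
    scale = 1.0 / math.sqrt(len(q))
    dim = len(v_new[0])
    running_max = -math.inf
    denom = 0.0
    acc = [0.0] * dim
    # Visit the previous cache first, then the current step's K/V.
    for keys, values in ((k_cache, v_cache), (k_new, v_new)):
        for k, v in zip(keys, values):
            s = scale * sum(a * b for a, b in zip(q, k))
            new_max = max(running_max, s)
            # Rescale what has been accumulated so far to the new max.
            correction = math.exp(running_max - new_max)
            w = math.exp(s - new_max)
            denom = denom * correction + w
            acc = [a * correction + w * x for a, x in zip(acc, v)]
            running_max = new_max
    return [a / denom for a in acc]
```

For example, calling `attend_split_kv(q, k_cache, v_cache, k_new, v_new)` should agree, up to floating-point rounding, with `attend_concat(q, k_cache + k_new, v_cache + v_new)`. The sketch handles a single query vector and a single head; batching, masking, and tiling are left out.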
