Mojo module
flash_attention
comptime values
UnsafePointer
comptime UnsafePointer = LegacyUnsafePointer[?, address_space=?, origin=?]
Functions
flash_attention: -
flash_attention_kv_cache: -
flash_attention_split_kv: Variant of flash attention that takes the previous KV cache input_{k,v}_cache_fn and the current KV tensors input_k_fn and input_v_fn as separate arguments.
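
The split-KV idea can be illustrated with a small conceptual sketch. This is not the Mojo API: the actual signature of flash_attention_split_kv is not shown on this page, so the Python function below, its parameter names (loosely modeled on the callback names in the description), and the explicit length arguments are all assumptions made for illustration. It attends over a cached key/value source and a current key/value source in turn, using a running (online) softmax so the two sources never have to be concatenated. It handles a single query vector and assumes at least one key in total.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def split_kv_attention(q, k_cache_fn, v_cache_fn, cache_len, k_fn, v_fn, cur_len):
    """Hypothetical sketch: attention for one query over two separate KV sources.

    Each *_fn maps a position index to a key or value vector (a list of floats).
    """
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")  # largest score seen so far
    running_sum = 0.0            # softmax denominator, relative to running_max
    acc = None                   # weighted sum of values, relative to running_max

    # Visit the cached positions first, then the current ones.
    sources = ((k_cache_fn, v_cache_fn, cache_len), (k_fn, v_fn, cur_len))
    for key_fn, value_fn, length in sources:
        for i in range(length):
            score = scale * _dot(q, key_fn(i))
            value = value_fn(i)
            if acc is None:
                acc = [0.0] * len(value)
            new_max = max(running_max, score)
            # Rescale earlier partial results to the new maximum.
            correction = math.exp(running_max - new_max)
            weight = math.exp(score - new_max)
            running_sum = running_sum * correction + weight
            acc = [a * correction + weight * v for a, v in zip(acc, value)]
            running_max = new_max
    return [a / running_sum for a in acc]


def reference_attention(q, keys, values):
    """Plain softmax attention over already-concatenated keys and values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * _dot(q, k) for k in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    dim = len(values[0])
    return [sum(e * v[d] for e, v in zip(exps, values)) / total for d in range(dim)]


# Toy data: two cached positions and one current position.
cache_k = [[1.0, 0.0], [0.0, 1.0]]
cache_v = [[1.0, 2.0], [3.0, 4.0]]
cur_k = [[1.0, 1.0]]
cur_v = [[5.0, 6.0]]
q = [0.5, -0.5]

split_out = split_kv_attention(
    q,
    lambda i: cache_k[i], lambda i: cache_v[i], len(cache_k),
    lambda i: cur_k[i], lambda i: cur_v[i], len(cur_k),
)
concat_out = reference_attention(q, cache_k + cur_k, cache_v + cur_v)
```

In this sketch, split_out and concat_out agree up to floating-point rounding, which is the point of passing the cache and the current tensors separately: the result matches attention over the concatenated sequence without building that concatenation.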