Mojo module
kv_cache_ragged
Functions
- generic_cross_attention_kv_cache_null_mask_cont_batch_ragged
- generic_flare_mla_decode_kv_cache_causal_mask_paged_ragged
- generic_flare_mla_decompress_k_cache_ragged_paged
- generic_flare_mla_prefill_kv_cache_causal_mask_paged_ragged
- generic_flare_mla_prefill_ragged_paged_plan
- generic_flash_attention_kv_cache_alibi_mask_cont_batch_ragged
- generic_flash_attention_kv_cache_causal_mask_cont_batch_ragged
- generic_flash_attention_kv_cache_causal_mask_paged_ragged
- generic_flash_attention_kv_cache_chunked_causal_mask_cont_batch_ragged
- generic_flash_attention_kv_cache_chunked_causal_mask_paged_ragged
- generic_flash_attention_kv_cache_null_mask_cont_batch_ragged
- generic_flash_attention_kv_cache_sliding_window_causal_mask_cont_batch_ragged
- generic_flash_attention_kv_cache_sliding_window_causal_mask_paged_ragged
- generic_fused_qk_rope_bshd_continous_batch_ragged
- generic_fused_qk_rope_bshd_paged_ragged: Applies fused RoPE to the Q and K projections.
- generic_fused_qkv_matmul_kv_cache_cont_batch_ragged: Performs a fused QKV matmul. Q outputs are written to the output argument while K and V outputs are written in-place into k_cache and v_cache.
- generic_fused_qkv_matmul_kv_cache_paged_fa3_fallback_ragged: Performs a fused QKV matmul. Q outputs are written to the output argument while K and V outputs are written in-place into k_cache and v_cache.
- generic_fused_qkv_matmul_kv_cache_paged_ragged: Performs a fused QKV matmul. Q outputs are written to the output argument while K and V outputs are written in-place into k_cache and v_cache.
- generic_fused_qkv_matmul_kv_cache_paged_ragged_bias: Performs a fused QKV matmul. Q outputs are written to the output argument while K and V outputs are written in-place into k_cache and v_cache.
- generic_fused_qkv_matmul_kv_cache_paged_ragged_scale: Performs a fused QKV matmul. Q outputs are written to the output argument while K and V outputs are written in-place into k_cache and v_cache.
- k_matmul_ragged_paged: Performs a matmul, writing the output into a mutable PagedKVCacheCollection object.
- kv_matmul_ragged_continuous_batching: Performs a matmul, writing the output into a mutable ContinuousBatchingKVCacheCollection object.
- unfused_qkv_matmul_ragged_continuous_batching_gguf_quantized: Performs a quantized matmul, writing the output into a mutable ContinuousBatchingKVCacheCollection object.
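
The fused QKV matmul entries above all describe the same data flow: Q goes to a separate output, while K and V are written in place into the cache. The following is a minimal conceptual sketch of that flow in plain Python, not the Mojo API. The names fused_qkv_ragged, matvec, row_offsets, and w_qkv are hypothetical, the caches are modeled as simple per-sequence lists rather than PagedKVCacheCollection or ContinuousBatchingKVCacheCollection objects, and the ragged layout (all sequences packed back to back with row offsets) is an assumption inferred from the function names.

```python
def matvec(x, w):
    """Multiply row vector x (length H) by weight matrix w (H rows, N columns)."""
    return [sum(x[h] * w[h][n] for h in range(len(x))) for n in range(len(w[0]))]


def fused_qkv_ragged(hidden, row_offsets, w_qkv, q_dim, kv_dim, k_cache, v_cache):
    """Project every token once with a concatenated QKV weight (hypothetical helper).

    hidden      : token vectors of all sequences, packed back to back
    row_offsets : rows row_offsets[b] .. row_offsets[b + 1] - 1 belong to sequence b
    k_cache     : per-sequence list of K rows, appended to in place
    v_cache     : per-sequence list of V rows, appended to in place
    Returns the Q rows in the same packed layout as `hidden`.
    """
    q_out = []
    for b in range(len(row_offsets) - 1):
        for row in range(row_offsets[b], row_offsets[b + 1]):
            qkv = matvec(hidden[row], w_qkv)  # one matmul yields Q, K and V
            q_out.append(qkv[:q_dim])
            k_cache[b].append(qkv[q_dim:q_dim + kv_dim])
            v_cache[b].append(qkv[q_dim + kv_dim:q_dim + 2 * kv_dim])
    return q_out


# Two sequences packed together: sequence 0 has 2 new tokens, sequence 1 has 1.
hidden = [[1, 2], [3, 4], [5, 6]]
row_offsets = [0, 2, 3]
# Columns 0-1 produce Q, column 2 produces K, column 3 produces V.
w_qkv = [[1, 0, 1, 0],
         [0, 1, 0, 1]]
# Sequence 1 already holds one cached token from an earlier step.
k_cache = [[], [[9]]]
v_cache = [[], [[9]]]

q = fused_qkv_ragged(hidden, row_offsets, w_qkv, 2, 1, k_cache, v_cache)
print(q)
print(k_cache)
print(v_cache)
```

In this sketch the caller only receives Q; the K and V rows land after whatever each sequence already had cached, which is the in-place behavior the descriptions refer to.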