Mojo module
kv_cache_ragged
Functions
- generic_cross_attention_kv_cache
- generic_flare_mla_decode_kv_cache_ragged
- generic_flare_mla_decompress_k_cache_ragged_paged
- generic_flare_mla_prefill_kv_cache_ragged
- generic_flare_mla_prefill_ragged_paged_plan
- generic_flash_attention_kv_cache_ragged
- generic_flash_attention_kv_cache_ragged_sink
- generic_fused_qk_rope_bshd_paged_ragged: Applies a fused RoPE (rotary position embedding) rotation to the Q and K projections.
- generic_fused_qkv_matmul_kv_cache_paged_ragged: Performs a fused QKV matmul. Q outputs are written to the output argument, while K and V outputs are written in place into k_cache and v_cache.
- generic_fused_qkv_matmul_kv_cache_paged_ragged_bias: Performs a fused QKV matmul. Q outputs are written to the output argument, while K and V outputs are written in place into k_cache and v_cache.
- generic_fused_qkv_matmul_kv_cache_paged_ragged_scale: Performs a fused QKV matmul. Q outputs are written to the output argument, while K and V outputs are written in place into k_cache and v_cache.
- generic_fused_qkv_matmul_kv_cache_paged_ragged_scale_float4: Performs a fused QKV matmul. Q outputs are written to the output argument, while K and V outputs are written in place into k_cache and v_cache.
- generic_kv_cache_radd_dispatch
- k_matmul_ragged_paged: Performs a matmul, writing the output into a mutable PagedKVCacheCollection object.
- k_matmul_ragged_paged_scale: Performs a matmul, writing the output into a mutable PagedKVCacheCollection object.
- kv_cache_2m_iadd_dispatch: Performs an in-place add to a paged KV cache with a concatenated K/V layout. This kernel is only used for LoRA.
- kv_cache_store_padded
- kv_cache_store_ragged
- kv_matmul_ragged_paged: Performs a matmul, writing the output into a mutable ContinuousBatchingKVCacheCollection object.
- unfused_qkv_matmul_ragged_paged_gguf_quantized: Performs a quantized matmul, writing the output into a mutable PagedKVCacheCollection object.
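To illustrate the RoPE operation named in generic_fused_qk_rope_bshd_paged_ragged, here is a minimal NumPy sketch of applying a rotary position embedding to Q and K in one fused pass. This is a conceptual illustration only, not the Mojo kernel's signature; the function names, the head layout, and the base frequency of 10000 are assumptions.

```python
import numpy as np

def rope_rotate(x, positions, theta=10000.0):
    # x: [tokens, head_dim]; rotate the pair (x[:, i], x[:, i + half])
    # by an angle that depends on the token position and pair index.
    half = x.shape[1] // 2
    freqs = theta ** (-np.arange(half) / half)        # [half]
    angles = positions[:, None] * freqs[None, :]      # [tokens, half]
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=1)

def fused_qk_rope(q, k, positions):
    # One pass rotates both Q (returned to the caller) and K
    # (which a kernel like this would then store into the KV cache).
    return rope_rotate(q, positions), rope_rotate(k, positions)
```

Because RoPE is a pure rotation, it preserves vector norms, and a token at position 0 is left unchanged; both properties are easy to check on this sketch.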
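The generic_fused_qkv_matmul_kv_cache_* entries all share one pattern: a single matmul over a ragged (flattened, variable-length) batch produces Q, K, and V together; Q goes to the output argument while K and V are scattered in place into the cache. A hedged NumPy sketch of that data flow (names and shapes are hypothetical, not the Mojo API):

```python
import numpy as np

def fused_qkv_matmul_ragged(hidden, wqkv, k_cache, v_cache, cache_positions):
    # hidden: [total_tokens, d] — sequences of different lengths are
    # flattened into one ragged batch (offsets omitted in this sketch).
    # wqkv: [d, 3*d] — the Q, K, and V projections fused into one weight.
    qkv = hidden @ wqkv                    # one fused matmul
    q, k, v = np.split(qkv, 3, axis=1)
    # K and V are written in place into preallocated cache storage
    # at each token's cache slot; only Q is returned.
    k_cache[cache_positions] = k
    v_cache[cache_positions] = v
    return q
```

Fusing the three projections into one matmul amortizes the read of `hidden` across Q, K, and V, and writing K/V straight into the cache avoids materializing them as separate outputs.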
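Several entries above write into a PagedKVCacheCollection. In a paged cache, a sequence's logical token positions are mapped through a per-sequence block table onto fixed-size physical blocks, so cache memory need not be contiguous per sequence. A minimal sketch of that indexing, assuming a hypothetical block size and helper names (not the actual collection's API):

```python
import numpy as np

BLOCK = 4  # tokens per cache page (hypothetical size)

def write_paged(cache, block_table, seq_pos, vec):
    # cache: [num_physical_blocks, BLOCK, d]
    # block_table: logical block index -> physical block index
    blk, off = divmod(seq_pos, BLOCK)
    cache[block_table[blk], off] = vec

def read_paged(cache, block_table, seq_pos):
    blk, off = divmod(seq_pos, BLOCK)
    return cache[block_table[blk], off]
```

The indirection lets sequences grow by appending blocks from a shared pool, which is why the matmul kernels above can scatter K/V for a ragged batch without per-sequence contiguous buffers.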