Mojo module
rope_split_store
Fused rope + split + KV store kernel.
Reads a flat QKV matmul output, applies RoPE to the Q and K regions, stores K/V to the paged KV cache, and writes the roped Q to the output buffer, all in a single GPU kernel to eliminate intermediate tensor round-trips.
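The data flow can be illustrated with a small CPU reference in plain Python. This is only a sketch of the semantics, not the kernel's implementation or its Mojo signature. Several details are assumptions made for illustration: the per-token row layout `[Q | K | V]`, the interleaved-pair RoPE convention with base `theta = 10000`, a dictionary standing in for the paged cache, and every name used here (`rope_split_store_reference`, `row_offsets`, `page_size`, `position_ids`).

```python
import math


def rope(head, pos, theta=10000.0):
    """Rotate each pair (head[i], head[i+1]) by the angle pos * theta**(-i/d).

    Assumption: interleaved pairing; the real kernel may pair elements differently.
    """
    d = len(head)
    out = list(head)
    for i in range(0, d, 2):
        angle = pos * theta ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out[i] = head[i] * c - head[i + 1] * s
        out[i + 1] = head[i] * s + head[i + 1] * c
    return out


def rope_heads(flat, n_heads, head_dim, pos):
    """Apply rope() independently to each head of a flat slice."""
    out = []
    for h in range(n_heads):
        out.extend(rope(flat[h * head_dim:(h + 1) * head_dim], pos))
    return out


def rope_split_store_reference(qkv, row_offsets, n_q, n_kv, head_dim,
                               cache, page_size, position_ids=None):
    """Hypothetical reference for the fused rope + split + store step.

    qkv          -- one flat row per token, assumed laid out as [Q | K | V]
    row_offsets  -- ragged boundaries: sequence s owns rows
                    row_offsets[s] .. row_offsets[s + 1] - 1
    cache        -- dict standing in for the paged cache, keyed by
                    (sequence, page, slot_in_page)
    position_ids -- optional explicit RoPE position per token
    Returns the roped Q rows.
    """
    q_len, kv_len = n_q * head_dim, n_kv * head_dim
    q_out = []
    for seq in range(len(row_offsets) - 1):
        start, end = row_offsets[seq], row_offsets[seq + 1]
        for tok in range(start, end):
            # Assumption: each sequence starts with an empty cache,
            # so the cache slot is the token's index within its sequence.
            slot = tok - start
            pos = slot if position_ids is None else position_ids[tok]
            row = qkv[tok]
            q = row[:q_len]
            k = row[q_len:q_len + kv_len]
            v = row[q_len + kv_len:]
            q_out.append(rope_heads(q, n_q, head_dim, pos))
            # K is roped before it is stored; V is stored unchanged.
            cache[(seq, slot // page_size, slot % page_size)] = (
                rope_heads(k, n_kv, head_dim, pos), list(v))
    return q_out


# Three tokens in two ragged sequences (lengths 2 and 1),
# with 1 Q head, 1 KV head, and head_dim 2.
cache = {}
qkv = [[1.0, 0.0, 0.0, 1.0, 5.0, 6.0]] * 3
q_out = rope_split_store_reference(qkv, [0, 2, 3], 1, 1, 2, cache, page_size=1)
```

In this sketch the unfused equivalent would materialize separate Q, K, and V tensors and a roped K tensor before the cache write. Doing all of it per token in one pass is what removes those intermediate buffers.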
Functions
- rope_split_store_paged_ragged: Rope+split+store with paged KV cache collection.
- rope_split_store_paged_ragged_with_position_ids: Rope+split+store with paged KV cache and explicit position IDs.