Mojo module
rope_split_store
Fused rope + split + KV store kernel.
Reads a flat QKV matmul output, applies RoPE to the Q and K regions, stores K/V to the paged KV cache, and writes the roped Q to the output buffer, all in a single GPU kernel to eliminate intermediate tensor round-trips.
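As a rough illustration of what the fused kernel computes for one token, here is a minimal pure-Python reference sketch. It is not the Mojo implementation, and everything specific in it is an assumption made for illustration: the `[Q | K | V]` ordering of the flat row, the interleaved-pair RoPE convention with base 10000, and the `k_pages` / `v_pages` / `page_table` cache layout. The function and parameter names are hypothetical.

```python
import math


def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs (vec[i], vec[i+1]) by pos * base**(-i/d).

    Assumption: interleaved-pair RoPE; the real kernel may pair elements
    differently or read precomputed frequencies instead.
    """
    d = len(vec)
    out = [0.0] * d
    for i in range(0, d, 2):
        angle = pos * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out[i] = vec[i] * c - vec[i + 1] * s
        out[i + 1] = vec[i] * s + vec[i + 1] * c
    return out


def rope_split_store(qkv_row, pos, n_q_heads, n_kv_heads, head_dim,
                     k_pages, v_pages, page_table, page_size):
    """Hypothetical single-token reference for the fused operation.

    qkv_row is assumed to be laid out as [Q | K | V]. The caches are
    indexed as pages[physical_page][slot][kv_head] (an assumed layout).
    Returns the roped Q values as a flat list.
    """
    q_size = n_q_heads * head_dim
    kv_size = n_kv_heads * head_dim
    assert len(qkv_row) == q_size + 2 * kv_size

    # Map the token's logical position to a physical page and slot.
    page = page_table[pos // page_size]
    slot = pos % page_size

    # Q region: apply RoPE per head and write to the output buffer.
    q_out = []
    for h in range(n_q_heads):
        start = h * head_dim
        q_out.extend(rope(qkv_row[start:start + head_dim], pos))

    # K region: apply RoPE, then store. V region: store unchanged.
    for h in range(n_kv_heads):
        k0 = q_size + h * head_dim
        v0 = q_size + kv_size + h * head_dim
        k_pages[page][slot][h] = rope(qkv_row[k0:k0 + head_dim], pos)
        v_pages[page][slot][h] = list(qkv_row[v0:v0 + head_dim])

    return q_out
```

In an unfused pipeline each of these steps (split, RoPE on Q, RoPE on K, cache write) would typically read and write its own intermediate tensor. The sketch does them in one pass over the row, which is the round-trip saving the description refers to.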
Functions
- rope_split_store_paged_ragged: Rope+split+store with paged KV cache collection.