Mojo module

rope_split_store

Fused rope + split + KV store kernel.

Reads a flat QKV matmul output, applies RoPE to the Q and K regions, stores K/V to the paged KV cache, and writes the roped Q to the output buffer, all in a single GPU kernel to eliminate intermediate tensor round-trips.
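The data flow described above can be illustrated with a small CPU reference in plain Python. This is only a sketch of the semantics, not the kernel's implementation or its API: the `[Q | K | V]` row layout, the interleaved (even, odd) RoPE pairing, the frequency base of 10000, and the names `rope`, `rope_split_store_ref`, `block_table`, and `page_size` are all assumptions made for illustration.

```python
import math


def rope(vec, pos, base=10000.0):
    """Rotate consecutive (even, odd) pairs of one head vector.

    Assumes the interleaved RoPE convention; the real kernel may pair
    elements differently.
    """
    d = len(vec)
    out = [0.0] * d
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # angle for this pair at this position
        c, s = math.cos(theta), math.sin(theta)
        out[i] = vec[i] * c - vec[i + 1] * s
        out[i + 1] = vec[i] * s + vec[i + 1] * c
    return out


def rope_split_store_ref(qkv_row, pos, n_q, n_kv, d,
                         k_cache, v_cache, block_table, page_size):
    """Hypothetical reference for one token: split, rope, store, return Q.

    qkv_row is assumed to be laid out as [Q heads | K heads | V heads].
    Caches are indexed as cache[page][slot][kv_head].
    """
    q_end = n_q * d
    k_end = q_end + n_kv * d

    # Map the logical token position to a physical page and slot.
    page = block_table[pos // page_size]
    slot = pos % page_size

    # RoPE on every Q head; the result goes to the output buffer.
    q_out = []
    for h in range(n_q):
        q_out.extend(rope(qkv_row[h * d:(h + 1) * d], pos))

    # RoPE on K, V copied unchanged; both go straight to the paged cache.
    for h in range(n_kv):
        k = qkv_row[q_end + h * d:q_end + (h + 1) * d]
        v = qkv_row[k_end + h * d:k_end + (h + 1) * d]
        k_cache[page][slot][h] = rope(k, pos)
        v_cache[page][slot][h] = list(v)

    return q_out


# Tiny example: 2 Q heads, 1 KV head, head size 2, two pages of two slots.
n_q, n_kv, d, page_size, num_pages = 2, 1, 2, 2, 2
k_cache = [[[None] * n_kv for _ in range(page_size)] for _ in range(num_pages)]
v_cache = [[[None] * n_kv for _ in range(page_size)] for _ in range(num_pages)]
block_table = [1, 0]  # logical page 0 -> physical page 1, logical 1 -> physical 0
qkv_row = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 5.0, 6.0]
q_out = rope_split_store_ref(qkv_row, 3, n_q, n_kv, d,
                             k_cache, v_cache, block_table, page_size)
```

In this reference the three steps run as separate loops over one token. The point of fusing them on the GPU is that Q, K, and V never have to be written out as separate intermediate tensors between the split, the rotation, and the cache write.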

Functions