Mojo module
paged_sparse_kv_index_remap
Map logical sequence token indices to MLA sparse physical row encoding.
Sparse MLA kernels expect each selected key position as::

    Int32(physical_block_id * page_size + token_offset_within_page)

where physical_block_id comes from the paged lookup_table. The indexer
instead emits logical positions t in [0, cache_length). This module
implements that remapping on GPU (or CPU) without device↔host staging of the
full sparse index or LUT tensors.
Invalid sparse slots conventionally use -1 and are copied through unchanged.
If the LUT entry is >= invalid_block_id (the runtime sentinel total_num_pages),
the output slot is written as -1.
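
The remapping rule described above can be illustrated with a small pure-Python reference for a single sequence. This is only a sketch, not the module's Mojo implementation: the function name `remap_logical_to_physical` and the flat-list arguments are hypothetical. It also assumes that a logical position t splits into a page index `t // page_size` and an in-page offset `t % page_size`, which the description implies but does not state.

```python
from typing import List


def remap_logical_to_physical(
    logical_indices: List[int],
    lookup_table_row: List[int],
    page_size: int,
    invalid_block_id: int,
) -> List[int]:
    """Reference remap for one sequence (hypothetical helper, not the Mojo API)."""
    out = []
    for t in logical_indices:
        if t < 0:
            # Invalid sparse slot (conventionally -1): copied through.
            out.append(t)
            continue
        # Assumption: logical position = page index * page_size + offset.
        page_idx, offset = divmod(t, page_size)
        block = lookup_table_row[page_idx]
        if block >= invalid_block_id:
            # LUT entry is the sentinel (or beyond): mark the slot invalid.
            out.append(-1)
        else:
            out.append(block * page_size + offset)
    return out


# page_size = 4, three logical pages mapped to physical blocks 7, 2 and 9;
# block 9 equals invalid_block_id, so positions on that page become -1.
result = remap_logical_to_physical([0, 5, -1, 10, 3], [7, 2, 9], 4, 9)
print(result)  # [28, 9, -1, -1, 31]
```

For example, logical position 5 falls on logical page 1 at offset 1; page 1 maps to physical block 2, so the encoded row is 2 * 4 + 1 = 9.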
Functions
- paged_sparse_kv_index_remap: High-level remap for sparse MLA MOGG ops (logical indices → physical rows).
- paged_sparse_kv_logical_to_physical_indices_from_row_offsets_dispatch: Remap logical sparse slots using ragged input_row_offsets (not per-slot batch ids).
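
To illustrate what "ragged input_row_offsets (not per-slot batch ids)" can mean in practice, the sketch below recovers each token row's sequence from the offsets and then applies the remap with that sequence's LUT row. It is a hedged pure-Python illustration, not the dispatch function's actual signature: `remap_ragged`, the nested-list layout, and the per-row slot lists are all assumptions made for the example.

```python
from bisect import bisect_right
from typing import List


def remap_ragged(
    sparse_indices: List[List[int]],   # one list of logical slots per token row
    input_row_offsets: List[int],      # e.g. [0, 2, 3]: rows 0-1 are seq 0, row 2 is seq 1
    lookup_table: List[List[int]],     # lookup_table[seq][logical_page] -> physical block
    page_size: int,
    invalid_block_id: int,
) -> List[List[int]]:
    """Hypothetical reference: derive the sequence of each row from offsets."""
    out = []
    for row, slots in enumerate(sparse_indices):
        # The sequence owning this row is the last offset that is <= row.
        seq = bisect_right(input_row_offsets, row) - 1
        lut_row = lookup_table[seq]
        out_row = []
        for t in slots:
            if t < 0:
                out_row.append(t)  # invalid slot copied through
                continue
            page_idx, offset = divmod(t, page_size)
            block = lut_row[page_idx]
            if block >= invalid_block_id:
                out_row.append(-1)
            else:
                out_row.append(block * page_size + offset)
        out.append(out_row)
    return out


remapped = remap_ragged(
    sparse_indices=[[0, 3], [1, -1], [2, 0]],
    input_row_offsets=[0, 2, 3],
    lookup_table=[[3, 1], [0, 4]],
    page_size=2,
    invalid_block_id=4,
)
print(remapped)  # [[6, 3], [7, -1], [-1, 0]]
```

Deriving the sequence from the offsets means no batch id has to be stored per slot; a GPU kernel would presumably perform the equivalent lookup per thread, but that detail is not covered by this description.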