
Mojo module

paged_sparse_kv_index_remap

Map logical sequence token indices to MLA sparse physical row encoding.

Sparse MLA kernels expect each selected key position as:

    Int32(physical_block_id * page_size + token_offset_within_page)

where physical_block_id comes from the paged lookup_table (LUT). The indexer instead emits logical positions t in [0, cache_length). This module performs that remapping on the GPU (or CPU), so neither the full sparse index tensor nor the LUT tensor has to be staged between device and host.

Invalid sparse slots conventionally hold -1 and are copied through unchanged. If the LUT entry for a slot is >= invalid_block_id (a runtime sentinel equal to total_num_pages), the output slot is written as -1.
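The remapping rule described above can be modeled for a single sequence with a short Python reference sketch. This is not the module's Mojo kernel or its API: the function name remap_sparse_indices, its parameters, and the flat per-sequence lut_row list are hypothetical, and the sketch assumes the logical page of position t is t // page_size with offset t % page_size, which the description implies but does not state.

```python
from typing import List


def remap_sparse_indices(
    sparse_indices: List[int],
    lut_row: List[int],
    page_size: int,
    invalid_block_id: int,
) -> List[int]:
    """Reference model of the logical-to-physical remap for one sequence.

    All names here are hypothetical; only the rules come from the module
    description.
    """
    out = []
    for t in sparse_indices:
        if t < 0:
            # Invalid sparse slot (conventionally -1): copied through as -1.
            out.append(-1)
            continue
        # Assumption: logical pages are contiguous runs of page_size tokens.
        logical_page = t // page_size
        token_offset = t % page_size
        physical_block_id = lut_row[logical_page]
        if physical_block_id >= invalid_block_id:
            # LUT entry is at or above the sentinel: emit -1.
            out.append(-1)
        else:
            out.append(physical_block_id * page_size + token_offset)
    return out


if __name__ == "__main__":
    # Three logical pages of 4 tokens; the last LUT entry equals the sentinel.
    remapped = remap_sparse_indices(
        sparse_indices=[0, 5, -1, 9, 3],
        lut_row=[7, 2, 10],
        page_size=4,
        invalid_block_id=10,  # stands in for total_num_pages
    )
    print(remapped)  # [28, 9, -1, -1, 31]
```

In this example, position 5 falls in logical page 1 at offset 1, which maps to physical block 2 and therefore row 2 * 4 + 1 = 9. Position 9 falls in logical page 2, whose LUT entry equals the sentinel, so it becomes -1.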

Functions