
Mojo module

paged_sparse_kv_index_remap

Map logical sequence token indices to MLA sparse physical row encoding.

Sparse MLA kernels expect each selected key position as:

Int32(physical_block_id * page_size + token_offset_within_page)

where physical_block_id comes from the paged lookup_table. The indexer instead emits logical positions t in [0, cache_length). This module implements that remapping on GPU (or CPU) without device↔host staging of the full sparse index or LUT tensors.

Invalid sparse slots are conventionally marked with -1 and are copied through unchanged. If the LUT entry is >= invalid_block_id (the runtime sentinel, total_num_pages), the output slot is written as -1.
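The remapping rule described above can be sketched as a plain Python reference loop. This is not the Mojo kernel or its API: the function name, argument layout, and the nested-list form of lookup_table are hypothetical, and the sketch assumes that a logical position t falls in logical page t // page_size at offset t % page_size, which the text implies but does not state.

```python
def remap_logical_to_physical(logical_indices, batch_ids, lookup_table,
                              page_size, invalid_block_id):
    """Reference sketch (hypothetical helper, not the module's API).

    logical_indices: logical token positions t, or -1 for invalid slots.
    batch_ids: batch index for each slot (assumed to be given per slot here).
    lookup_table: lookup_table[batch][logical_page] -> physical_block_id.
    invalid_block_id: sentinel; LUT entries >= this value are unmapped.
    """
    out = []
    for t, b in zip(logical_indices, batch_ids):
        if t < 0:
            # Invalid sparse slot: copy -1 through unchanged.
            out.append(-1)
            continue
        # Assumption: pages are filled contiguously in logical order.
        logical_page, token_offset = divmod(t, page_size)
        physical_block_id = lookup_table[b][logical_page]
        if physical_block_id >= invalid_block_id:
            # Unmapped page: emit the invalid marker.
            out.append(-1)
        else:
            out.append(physical_block_id * page_size + token_offset)
    return out


# Two sequences, two logical pages each, page_size = 4, total_num_pages = 5.
lut = [[3, 0], [5, 1]]
result = remap_logical_to_physical([0, 5, -1, 2], [0, 0, 0, 1], lut, 4, 5)
print(result)  # [12, 1, -1, -1]
```

In the example, slot 0 maps to block 3 at offset 0 (row 12), slot 1 maps to block 0 at offset 1 (row 1), the third slot is already invalid, and the last slot hits a LUT entry equal to the sentinel, so it is written as -1.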

Functions

paged_sparse_kv_index_remap: High-level remap for sparse MLA MOGG ops (logical indices → physical rows).
paged_sparse_kv_logical_to_physical_indices_from_row_offsets_dispatch: Remap logical sparse slots using ragged input_row_offsets (not per-slot batch ids).
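
As a rough illustration of what using ragged input_row_offsets instead of per-slot batch ids can mean, the Python sketch below recovers the batch index of each row from the offsets. The helper name is hypothetical, and the sketch assumes the usual ragged convention that row r belongs to batch b when row_offsets[b] <= r < row_offsets[b + 1]; the Mojo function's actual signature and layout may differ.

```python
from bisect import bisect_right


def batch_ids_from_row_offsets(row_offsets):
    """Hypothetical helper: derive one batch id per row from ragged offsets.

    row_offsets has length batch_size + 1, is non-decreasing, and starts at 0.
    """
    total_rows = row_offsets[-1]
    # bisect_right finds the first offset strictly greater than r;
    # the batch that owns row r is the one just before it.
    return [bisect_right(row_offsets, r) - 1 for r in range(total_rows)]


# Three sequences with 2, 0, and 3 rows respectively.
print(batch_ids_from_row_offsets([0, 2, 2, 5]))  # [0, 0, 2, 2, 2]
```

The empty middle sequence is skipped, so no row is assigned to batch 1.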
