Mojo function

sparse_indexer_prefill_topk

def sparse_indexer_prefill_topk[num_index_heads: Int, block_size: Int](
    input_row_offsets: TileTensor[DType.uint32, address_space=input_row_offsets.address_space, linear_idx_type=input_row_offsets.linear_idx_type, element_size=input_row_offsets.element_size],
    prefix_lens: TileTensor[DType.uint32, address_space=prefix_lens.address_space, linear_idx_type=prefix_lens.linear_idx_type, element_size=prefix_lens.element_size],
    score: TileTensor[DType.float32, address_space=score.address_space, linear_idx_type=score.linear_idx_type, element_size=score.element_size],
    out_idxs: TileTensor[DType.int32, address_space=out_idxs.address_space, linear_idx_type=out_idxs.linear_idx_type, element_size=out_idxs.element_size],
    batch: Int,
    total_q: Int,
    max_num_blocks: Int,
    topk: Int,
    ctx: DeviceContext,
)

Launches the prefill top-k selection kernel, which reads from score and writes its results to out_idxs.

See sparse_indexer_prefill for the argument contract. This function is exposed separately so that tests can drive scoring and selection independently.
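
When testing selection on its own, it can help to compare against a small host-side reference. The following is a minimal Python sketch of generic per-row top-k selection, not the kernel's actual algorithm or data layout. The function name topk_indices_reference, the num_valid argument, the -1 padding value, and the tie-breaking rule are all hypothetical choices made for illustration; the real conventions are defined by the sparse_indexer_prefill argument contract.

```python
def topk_indices_reference(scores, num_valid, topk, pad=-1):
    """Hypothetical host-side reference for per-row top-k selection.

    scores:    list of rows, each a list of float scores (assumed layout).
    num_valid: number of leading entries in each row that are valid (assumed).
    topk:      number of indices to emit per row.
    pad:       filler for rows with fewer than topk valid entries (assumed).
    """
    out = []
    for row, n in zip(scores, num_valid):
        n = min(n, len(row))
        # Sort valid indices by descending score; ties go to the lower index.
        order = sorted(range(n), key=lambda i: (-row[i], i))
        best = order[:topk]
        # Pad short rows so every output row has exactly topk entries.
        out.append(best + [pad] * (topk - len(best)))
    return out


scores = [
    [0.1, 0.9, 0.5, 0.7],
    [0.3, 0.2, 0.8, 0.4],
]
# The second row treats only its first two entries as valid.
result = topk_indices_reference(scores, num_valid=[4, 2], topk=3)
print(result)
```

A reference like this is only useful once its layout, padding, and tie-breaking assumptions have been aligned with the contract documented for sparse_indexer_prefill.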