Mojo struct
KVCacheIterator
struct KVCacheIterator[cache_t: MHAOperand, tile_size: Int, kv_num_heads: Int, depth: Int, cache_depth: Int = depth, head_dim_offset: Int = Int(0)]
TileTensor-based DRAM tile iterator.
Each tile is returned as a TileTensor that uses a Scalar for the row dimension (the valid-row count) and ComptimeInt for the depth and strides. No RuntimeLayout storage is used.
When cache_depth != depth, the DRAM stride uses cache_depth rather than depth (e.g., MLA K_rope reads 64 columns from a 576-wide cache row). head_dim_offset shifts the starting column (e.g., to skip to the rope portion at column 512).
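The stride and offset arithmetic described above can be modeled in a few lines. This is a minimal Python sketch, not the Mojo implementation: the helper `tile_offsets` is hypothetical, and it assumes the rows for a single KV head are laid out row-major with a row stride of exactly `cache_depth` elements.

```python
def tile_offsets(row_start, valid_rows, depth, cache_depth, head_dim_offset=0):
    """Flat element offsets touched by one tile, listed row by row.

    Assumption (not from the source): rows are contiguous with a row
    stride of cache_depth elements, and columns have unit stride.
    """
    return [
        [row * cache_depth + head_dim_offset + col for col in range(depth)]
        for row in range(row_start, row_start + valid_rows)
    ]


# MLA-style numbers from the text: 64 rope columns starting at
# column 512 of a 576-wide cache row.
offsets = tile_offsets(
    row_start=0, valid_rows=2, depth=64, cache_depth=576, head_dim_offset=512
)
first_row, second_row = offsets
print(first_row[0], first_row[-1])   # 512 575
print(second_row[0] - first_row[0])  # 576
```

Each tile row covers only `depth` columns, but consecutive rows are `cache_depth` elements apart, which is why the two parameters are kept separate.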
Fields
- cache (cache_t)
- end (Int)
- tile_start_row (Int)
- batch_idx (Int)
- kv_head_idx (Int)
Implemented traits
comptime members
GmemTileLayout
comptime GmemTileLayout = Layout[*?, *?]
GmemTileType
comptime GmemTileType = TileTensor[cache_t.dtype, Layout[*?, *?], ImmutAnyOrigin]
Methods
__init__
def __init__(out self, cache: cache_t, batch_idx: Int, kv_head_idx: Int, end: Int)
next_tile
def next_tile(mut self) -> Self.GmemTileType
Returns a TileTensor for the next DRAM tile.
Returns:
Self.GmemTileType
increment
def increment(mut self)
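The page does not document what `increment` does or how `next_tile` interacts with `tile_start_row` and `end`. The following Python toy model shows one plausible reading, under stated assumptions: `next_tile` reports the current tile without advancing, `increment` advances `tile_start_row` by `tile_size`, and the valid-row count of the last tile is clipped by `end`. The class `TileCursor` and the function `walk` are hypothetical names, not part of the Mojo API.

```python
class TileCursor:
    """Toy stand-in for the iterator's row bookkeeping (not the Mojo API)."""

    def __init__(self, end, tile_size):
        self.end = end
        self.tile_size = tile_size
        self.tile_start_row = 0

    def next_tile(self):
        # Assumed: describe the current tile; the last tile may be partial,
        # so the valid-row count is clipped to the rows remaining before end.
        remaining = self.end - self.tile_start_row
        valid_rows = max(0, min(self.tile_size, remaining))
        return self.tile_start_row, valid_rows

    def increment(self):
        # Assumed: move the start row forward by one full tile.
        self.tile_start_row += self.tile_size


def walk(end, tile_size):
    """Collect (start_row, valid_rows) for every tile up to end."""
    cursor = TileCursor(end, tile_size)
    tiles = []
    while cursor.tile_start_row < end:
        tiles.append(cursor.next_tile())
        cursor.increment()
    return tiles


print(walk(end=10, tile_size=4))  # [(0, 4), (4, 4), (8, 2)]
```

With 10 rows and a tile size of 4, the model yields two full tiles and a final tile with 2 valid rows, which matches the role the description gives to the Scalar row dimension.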