
Mojo struct

KVCacheIterator

struct KVCacheIterator[cache_t: MHAOperand, tile_size: Int, kv_num_heads: Int, depth: Int, cache_depth: Int = depth, head_dim_offset: Int = 0]

TileTensor-based DRAM tile iterator.

Each tile is returned as a TileTensor whose row dimension is a Scalar (the valid-row count) and whose depth and strides are ComptimeInt values. No RuntimeLayout storage is used.

When cache_depth != depth, the DRAM stride uses cache_depth rather than depth (for example, MLA K_rope reads 64 columns from a 576-wide cache row). head_dim_offset shifts the starting column (for example, to skip to the rope portion at column 512).
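The stride and offset arithmetic described above can be illustrated with a small Python model. This is only a sketch under assumptions: it treats the cache for a single (batch, head) pair as a flat row-major buffer with cache_depth elements per row, and the helper name tile_element_offset is hypothetical, not part of the Mojo API. The numbers follow the MLA K_rope example.

```python
def tile_element_offset(row, col, cache_depth, head_dim_offset=0):
    """Flat offset of tile element (row, col) in a row-major cache buffer.

    Assumption: consecutive rows are `cache_depth` elements apart (the DRAM
    stride), and the tile's columns start at `head_dim_offset` in each row.
    """
    return row * cache_depth + head_dim_offset + col


# MLA K_rope example from the text: a 64-column tile inside a 576-wide
# cache row, starting at column 512.
depth, cache_depth, head_dim_offset = 64, 576, 512

first = tile_element_offset(0, 0, cache_depth, head_dim_offset)
last_in_row = tile_element_offset(0, depth - 1, cache_depth, head_dim_offset)
next_row = tile_element_offset(1, 0, cache_depth, head_dim_offset)

print(first, last_in_row, next_row)  # 512 575 1088
```

In this model the tile covers columns 512 through 575 of each row, and moving down one tile row advances the offset by cache_depth (576), not by depth (64).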

Fields

  • cache (cache_t)
  • end (Int)
  • tile_start_row (Int)
  • batch_idx (Int)
  • kv_head_idx (Int)

Implemented traits

AnyType, ImplicitlyDeletable

comptime members

GmemTileLayout

comptime GmemTileLayout = Layout[*?, *?]

GmemTileType

comptime GmemTileType = TileTensor[cache_t.dtype, Layout[*?, *?], ImmutAnyOrigin]

Methods

__init__

def __init__(out self, cache: cache_t, batch_idx: Int, kv_head_idx: Int, end: Int)

next_tile

def next_tile(mut self) -> Self.GmemTileType

Returns a TileTensor for the next DRAM tile.

Returns:

Self.GmemTileType
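The struct description says the row dimension of each returned tile carries a valid-row count. A minimal Python model of how such a count could be derived from tile_start_row, end, and tile_size is sketched below. The clipping rule and the helper names valid_rows and tile_row_counts are assumptions for illustration; the documentation does not spell out the exact computation or how next_tile and increment interact.

```python
def valid_rows(tile_start_row, end, tile_size):
    """Rows of the tile starting at `tile_start_row` that lie before `end`.

    Assumption: a full tile has `tile_size` rows and the final tile is
    clipped at `end`. This rule is illustrative, not confirmed by the docs.
    """
    return max(0, min(tile_size, end - tile_start_row))


def tile_row_counts(end, tile_size):
    """Valid-row count for each tile when walking rows [0, end)."""
    return [valid_rows(start, end, tile_size)
            for start in range(0, end, tile_size)]


print(tile_row_counts(end=150, tile_size=64))  # [64, 64, 22]
```

Under these assumptions, only the last tile is partial, which is why the row extent would need to be a runtime value while depth and strides can stay compile-time constants.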

increment

def increment(mut self)