Mojo struct

KVCacheIterator

struct KVCacheIterator[cache_t: MHAOperand, tile_size: Int, kv_num_heads: Int, depth: Int, cache_depth: Int = depth, head_dim_offset: Int = 0]

TileTensor-based DRAM tile iterator.

Each tile is returned as a TileTensor whose row dimension is a RuntimeInt (the valid-row count) and whose depth and strides are ComptimeInt values. No RuntimeLayout storage is used.

When cache_depth != depth, the DRAM row stride is cache_depth rather than depth (e.g., MLA K_rope reads 64 columns from each 576-wide cache row). head_dim_offset shifts the starting column (e.g., to skip to the rope portion at column 512).
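The stride and offset rule above can be illustrated with a small Python model. This is only a sketch of the arithmetic, assuming a dense row-major buffer for a single head. The helper `tile_element_offsets` is hypothetical and is not part of the Mojo API, and the real cache layout (for example, a paged cache) may differ.

```python
def tile_element_offsets(row, depth, cache_depth=None, head_dim_offset=0):
    """Return the flat element offsets read for one row of a tile.

    Hypothetical model: assumes a dense row-major buffer whose rows are
    cache_depth elements wide. Not the library's implementation.
    """
    if cache_depth is None:
        cache_depth = depth  # mirrors the default cache_depth = depth
    if head_dim_offset + depth > cache_depth:
        raise ValueError("the requested columns do not fit in a cache row")
    # The row stride is cache_depth; the column start is head_dim_offset.
    start = row * cache_depth + head_dim_offset
    return range(start, start + depth)


# MLA K_rope example from the text: 64 columns out of a 576-wide row,
# starting at column 512.
rope_row0 = tile_element_offsets(0, depth=64, cache_depth=576, head_dim_offset=512)
rope_row1 = tile_element_offsets(1, depth=64, cache_depth=576, head_dim_offset=512)

# Default case: cache_depth == depth, so rows are packed back to back.
dense_row2 = tile_element_offsets(2, depth=128)
```

In this model the rope columns of row 0 span offsets 512 through 575, and row 1 starts one full cache row (576 elements) later, even though only 64 elements are read per row.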

Fields

  • cache (cache_t)
  • end (Int)
  • tile_start_row (Int)
  • batch_idx (Int)
  • kv_head_idx (Int)

Implemented traits

AnyType, ImplicitlyDestructible

comptime members

GmemTileLayout

comptime GmemTileLayout = Layout[*?, *?]

GmemTileType

comptime GmemTileType = TileTensor[cache_t.dtype, Layout[*?, *?], ImmutAnyOrigin]

Methods

__init__

__init__(out self, cache: cache_t, batch_idx: Int, kv_head_idx: Int, end: Int)

next_tile

next_tile(mut self) -> KVCacheIterator[cache_t, tile_size, kv_num_heads, depth, cache_depth, head_dim_offset].GmemTileType

Returns a TileTensor for the next DRAM tile.

Returns:

KVCacheIterator[cache_t, tile_size, kv_num_heads, depth, cache_depth, head_dim_offset].GmemTileType

increment

increment(mut self)
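The page does not document how next_tile and increment interact, so the following Python model is a sketch under stated assumptions rather than a description of the actual behavior. It assumes that next_tile returns the tile beginning at tile_start_row and then advances by tile_size, that increment advances by tile_size without producing a tile, and that the valid-row count of a tile is clamped by end. The class `TileIteratorModel` is hypothetical.

```python
class TileIteratorModel:
    """Hypothetical model of the row bookkeeping of a tile iterator."""

    def __init__(self, end, tile_size):
        self.end = end                # assumed: one past the last valid row
        self.tile_size = tile_size
        self.tile_start_row = 0

    def next_tile(self):
        # Assumed: describe the current tile, then advance to the next one.
        start = self.tile_start_row
        valid_rows = max(0, min(self.tile_size, self.end - start))
        self.tile_start_row += self.tile_size
        return start, valid_rows

    def increment(self):
        # Assumed: skip one tile without producing it.
        self.tile_start_row += self.tile_size


# 300 rows in tiles of 128: two full tiles and a partial tile of 44 rows.
it = TileIteratorModel(end=300, tile_size=128)
tiles = [it.next_tile() for _ in range(3)]

# Skipping the first tile with increment, then reading the second one.
skipper = TileIteratorModel(end=300, tile_size=128)
skipper.increment()
second = skipper.next_tile()
```

Under these assumptions, the partial last tile is where a runtime row count matters: the tile shape stays fixed at tile_size rows, while only the first 44 rows of the final tile are valid.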