Python class

KVCacheInputsPerDevice

class max.nn.kv_cache.KVCacheInputsPerDevice(blocks, cache_lengths, lookup_table, max_lengths, kv_scales=None, attention_dispatch_metadata=None)

Bases: object

Holds the concrete KV cache buffer inputs for a single device.

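Parameters:

blocks (Buffer)
cache_lengths (Buffer)
lookup_table (Buffer)
max_lengths (Buffer)
kv_scales (Buffer | None)
attention_dispatch_metadata (Buffer | None)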

as_list()

Returns the non-None KV cache buffers in ABI order.

Returns:

A list of Buffer objects containing blocks, cache_lengths, lookup_table, and max_lengths, followed by kv_scales and attention_dispatch_metadata when they are not None.

Return type:

list[Buffer]
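
The filtering behavior described for as_list() can be illustrated with a small, self-contained sketch. It does not import the real library: `Buffer` here is a hypothetical stand-in that only carries a label, `KVCacheInputsSketch` is an illustrative mirror of the documented fields, and the ordering is assumed to follow the order in which the fields are listed above. The actual implementation may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


class Buffer:
    """Hypothetical stand-in for the real Buffer type; holds only a label."""

    def __init__(self, name: str) -> None:
        self.name = name


@dataclass
class KVCacheInputsSketch:
    """Illustrative mirror of the documented fields, not the library class."""

    blocks: Buffer
    cache_lengths: Buffer
    lookup_table: Buffer
    max_lengths: Buffer
    kv_scales: Buffer | None = None
    attention_dispatch_metadata: Buffer | None = None

    def as_list(self) -> list[Buffer]:
        # Required buffers first, in the documented field order (assumed).
        buffers = [
            self.blocks,
            self.cache_lengths,
            self.lookup_table,
            self.max_lengths,
        ]
        # Optional buffers are appended only when they are set.
        for optional in (self.kv_scales, self.attention_dispatch_metadata):
            if optional is not None:
                buffers.append(optional)
        return buffers


required = [
    Buffer(n)
    for n in ("blocks", "cache_lengths", "lookup_table", "max_lengths")
]
without_optional = KVCacheInputsSketch(*required)
with_scales = KVCacheInputsSketch(*required, kv_scales=Buffer("kv_scales"))

names_without = [b.name for b in without_optional.as_list()]
names_with = [b.name for b in with_scales.as_list()]
print(names_without)
print(names_with)
```

In this sketch the list has four entries when both optional fields are None and grows by one for each optional buffer that is set, so a caller that indexes into the list would need to know which optional buffers were provided.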

attention_dispatch_metadata

attention_dispatch_metadata: Buffer | None = None

blocks

blocks: Buffer

cache_lengths

cache_lengths: Buffer

kv_scales

kv_scales: Buffer | None = None

lookup_table

lookup_table: Buffer

max_lengths

max_lengths: Buffer
