PagedCacheValues

class max.nn.kv_cache.PagedCacheValues(kv_blocks, cache_lengths, lookup_table, max_lengths, kv_scales=None, dispatch_metadata=None)

Bases: NestedIterableDataclass[BufferValue | TensorValue]

Concrete graph values for a single device’s paged KV cache.

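Parameters:

kv_blocks (BufferValue)
cache_lengths (TensorValue)
lookup_table (TensorValue)
max_lengths (TensorValue)
kv_scales (BufferValue | None)
dispatch_metadata (AttentionDispatchMetadata[TensorValue] | None)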

cache_lengths

cache_lengths: TensorValue

dispatch_metadata

dispatch_metadata: AttentionDispatchMetadata[TensorValue] | None = None

kv_blocks

kv_blocks: BufferValue

kv_scales

kv_scales: BufferValue | None = None

lookup_table

lookup_table: TensorValue

max_lengths

max_lengths: TensorValue

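
The sketch below is a rough illustration of how a container shaped like PagedCacheValues might be flattened into a sequence of graph values. It does not import or call MAX. FakeValue, FakeDispatchMetadata, PagedCacheValuesSketch, and the flattening rule (visit fields in declaration order, recurse into nested dataclasses, skip None) are all assumptions made for illustration. Only the field names, their order, and the two None defaults come from the signature documented here, and the real NestedIterableDataclass may behave differently.

```python
from __future__ import annotations

from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Iterator, Optional


@dataclass
class FakeValue:
    """Hypothetical stand-in for a graph value (BufferValue or TensorValue)."""
    name: str


@dataclass
class FakeDispatchMetadata:
    """Hypothetical stand-in for AttentionDispatchMetadata; `payload` is invented."""
    payload: FakeValue


def _flatten(obj: Any) -> Iterator[FakeValue]:
    # Assumed rule: walk fields in declaration order, skip None,
    # yield leaf values, and recurse into nested dataclasses.
    for f in fields(obj):
        value = getattr(obj, f.name)
        if value is None:
            continue
        if isinstance(value, FakeValue):
            yield value
        elif is_dataclass(value):
            yield from _flatten(value)


@dataclass
class PagedCacheValuesSketch:
    """Mirrors only the documented field names, order, and defaults."""
    kv_blocks: FakeValue
    cache_lengths: FakeValue
    lookup_table: FakeValue
    max_lengths: FakeValue
    kv_scales: Optional[FakeValue] = None
    dispatch_metadata: Optional[FakeDispatchMetadata] = None

    def __iter__(self) -> Iterator[FakeValue]:
        return _flatten(self)


# Only the four required fields are set, so the optional ones are skipped.
cache = PagedCacheValuesSketch(
    kv_blocks=FakeValue("kv_blocks"),
    cache_lengths=FakeValue("cache_lengths"),
    lookup_table=FakeValue("lookup_table"),
    max_lengths=FakeValue("max_lengths"),
)
flat = [v.name for v in cache]

# Setting the optional fields adds their leaves after the required ones.
full = PagedCacheValuesSketch(
    kv_blocks=FakeValue("kv_blocks"),
    cache_lengths=FakeValue("cache_lengths"),
    lookup_table=FakeValue("lookup_table"),
    max_lengths=FakeValue("max_lengths"),
    kv_scales=FakeValue("kv_scales"),
    dispatch_metadata=FakeDispatchMetadata(payload=FakeValue("dispatch_payload")),
)
flat_full = [v.name for v in full]
```

Under these assumptions, a consumer can treat the cache as a flat list of inputs without special-casing the optional kv_scales and dispatch_metadata fields.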