Python class
PagedCacheValues
class max.nn.kv_cache.PagedCacheValues(kv_blocks, cache_lengths, lookup_table, max_lengths, kv_scales=None, dispatch_metadata=None)
Bases: NestedIterableDataclass[BufferValue | TensorValue]
Concrete graph values for a single device’s paged KV cache.
Parameters:
- kv_blocks (BufferValue)
- cache_lengths (TensorValue)
- lookup_table (TensorValue)
- max_lengths (TensorValue)
- kv_scales (BufferValue | None)
- dispatch_metadata (AttentionDispatchMetadata[TensorValue] | None)
cache_lengths
cache_lengths: TensorValue
dispatch_metadata
dispatch_metadata: AttentionDispatchMetadata[TensorValue] | None = None
kv_blocks
kv_blocks: BufferValue
kv_scales
kv_scales: BufferValue | None = None
lookup_table
lookup_table: TensorValue
max_lengths
max_lengths: TensorValue
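The constructor signature above takes four required fields and two optional ones that default to None. A minimal sketch of that layout follows, using a stand-in dataclass rather than the real class so that it runs without MAX installed. The name `PagedCacheValuesSketch`, the helper `present_fields`, and the placeholder string values are all hypothetical. The helper does not claim to reproduce how `NestedIterableDataclass` iterates.

```python
from dataclasses import dataclass, fields
from typing import Any, Iterator, Optional, Tuple


@dataclass
class PagedCacheValuesSketch:
    """Stand-in mirroring the documented field names; NOT the real MAX class."""

    # Required fields, in the documented constructor order.
    kv_blocks: Any       # BufferValue in the real class
    cache_lengths: Any   # TensorValue in the real class
    lookup_table: Any    # TensorValue in the real class
    max_lengths: Any     # TensorValue in the real class
    # Optional fields; both default to None, as in the documented signature.
    kv_scales: Optional[Any] = None          # BufferValue | None
    dispatch_metadata: Optional[Any] = None  # AttentionDispatchMetadata[TensorValue] | None

    def present_fields(self) -> Iterator[Tuple[str, Any]]:
        """Hypothetical helper: yield (name, value) for every field that is set."""
        for f in fields(self):
            value = getattr(self, f.name)
            if value is not None:
                yield f.name, value


# Placeholder strings stand in for real graph values (assumption for illustration).
values = PagedCacheValuesSketch(
    kv_blocks="kv_blocks_buffer",
    cache_lengths="cache_lengths_tensor",
    lookup_table="lookup_table_tensor",
    max_lengths="max_lengths_tensor",
)
names = [name for name, _ in values.present_fields()]
print(names)
```

In this sketch, leaving `kv_scales` and `dispatch_metadata` unset means only the four required fields are reported as present.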