Python class
KVCacheInputsPerDevice
class max.nn.kv_cache.KVCacheInputsPerDevice(blocks, cache_lengths, lookup_table, max_lengths, kv_scales=None, attention_dispatch_metadata=None)
Bases: object
Holds the concrete KV cache buffer inputs for a single device.
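Parameters:
- blocks (Buffer)
- cache_lengths (Buffer)
- lookup_table (Buffer)
- max_lengths (Buffer)
- kv_scales (optional, defaults to None)
- attention_dispatch_metadata (optional, defaults to None)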
as_list()
Returns the non-None KV cache buffers in ABI order.
attention_dispatch_metadata
blocks
blocks: Buffer
cache_lengths
cache_lengths: Buffer
kv_scales
lookup_table
lookup_table: Buffer
max_lengths
max_lengths: Buffer
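The behavior described for as_list() can be illustrated with a small stand-in class. This is a minimal sketch, not the max.nn.kv_cache implementation: KVCacheInputsSketch is a hypothetical name, plain strings stand in for Buffer objects, and the sketch assumes that "ABI order" matches the constructor argument order, which this page does not confirm.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class KVCacheInputsSketch:
    """Hypothetical stand-in; NOT the real KVCacheInputsPerDevice."""

    blocks: Any
    cache_lengths: Any
    lookup_table: Any
    max_lengths: Any
    kv_scales: Optional[Any] = None
    attention_dispatch_metadata: Optional[Any] = None

    def as_list(self) -> list:
        # Assumption: "ABI order" is the constructor argument order.
        ordered = [
            self.blocks,
            self.cache_lengths,
            self.lookup_table,
            self.max_lengths,
            self.kv_scales,
            self.attention_dispatch_metadata,
        ]
        # Optional entries left as None are dropped from the result.
        return [buf for buf in ordered if buf is not None]


# Plain strings stand in for Buffer objects in this sketch.
inputs = KVCacheInputsSketch(
    blocks="blocks",
    cache_lengths="cache_lengths",
    lookup_table="lookup_table",
    max_lengths="max_lengths",
)
flat = inputs.as_list()
```

With both optional fields left at None, the returned list holds only the four required entries; setting kv_scales would add a fifth entry after max_lengths under the ordering assumed here.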