Python class
MultiKVCacheParams
class max.nn.kv_cache.MultiKVCacheParams(params, page_size, data_parallel_degree, n_devices, kv_connector, host_kvcache_swap_space_gb, num_eagle_speculative_tokens=0)
Bases: KVCacheParamInterface
Aggregates multiple KV cache parameter sets.
This class implements KVCacheParamInterface by aggregating multiple KVCacheParamInterface instances. It is useful for models with multiple distinct KV caches (for example, different cache configurations for different layers).
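Parameters:
params (Sequence[KVCacheParams])
page_size (int)
data_parallel_degree (int)
n_devices (int)
kv_connector (KVConnectorType | None)
host_kvcache_swap_space_gb
num_eagle_speculative_tokens (int)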
bytes_per_block
property bytes_per_block: int
Total bytes per block across all KV caches.
Since all caches allocate memory for the same sequence, the total memory cost per block is the sum across all param sets.
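As a rough illustration of that sum, the sketch below uses a hypothetical stand-in class (`ToyCacheParams`, not part of MAX) and made-up block sizes. It only mirrors the arithmetic described above and should not be read as the library's implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ToyCacheParams:
    """Hypothetical stand-in for one KV cache parameter set (not a MAX class)."""

    bytes_per_block: int


def total_bytes_per_block(params: Sequence[ToyCacheParams]) -> int:
    # Every cache stores a block for the same sequence, so the costs add up.
    return sum(p.bytes_per_block for p in params)


# Illustrative numbers only: two caches with different per-block footprints.
caches = [
    ToyCacheParams(bytes_per_block=4096),
    ToyCacheParams(bytes_per_block=1024),
]
total = total_bytes_per_block(caches)
print(total)  # 5120
```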
data_parallel_degree
data_parallel_degree: int
from_params()
classmethod from_params(*params)
Creates a MultiKVCacheParams from one or more KVCacheParams.
Parameters:
params (KVCacheParams) – One or more KVCacheParams instances to aggregate. All params must share the same page_size, data_parallel_degree, n_devices, enable_kvcache_swapping_to_host, and host_kvcache_swap_space_gb values.
Returns:
A new MultiKVCacheParams aggregating all provided params.
Raises:
ValueError – If no params are provided.
Return type:
MultiKVCacheParams
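The contract described for from_params() can be sketched with stand-in types. Everything below is hypothetical: `ToyParams` and `ToyMultiParams` are not MAX classes, the field types are guesses, and raising ValueError on mismatched values is an assumption, since the documentation above only states that ValueError is raised when no params are provided.

```python
from dataclasses import dataclass
from typing import Tuple

# Fields that, per the description above, must match across all param sets.
SHARED_FIELDS = (
    "page_size",
    "data_parallel_degree",
    "n_devices",
    "enable_kvcache_swapping_to_host",
    "host_kvcache_swap_space_gb",
)


@dataclass(frozen=True)
class ToyParams:
    """Hypothetical stand-in for KVCacheParams; only the shared fields are modeled."""

    page_size: int
    data_parallel_degree: int
    n_devices: int
    enable_kvcache_swapping_to_host: bool
    host_kvcache_swap_space_gb: float  # assumed type


@dataclass(frozen=True)
class ToyMultiParams:
    """Hypothetical stand-in for MultiKVCacheParams."""

    params: Tuple[ToyParams, ...]
    page_size: int

    @classmethod
    def from_params(cls, *params: ToyParams) -> "ToyMultiParams":
        if not params:
            # Documented behavior: no params is an error.
            raise ValueError("at least one param set is required")
        first = params[0]
        for name in SHARED_FIELDS:
            if any(getattr(p, name) != getattr(first, name) for p in params):
                # Assumption: the source does not say how a mismatch is reported.
                raise ValueError(f"all params must share the same {name}")
        return cls(params=params, page_size=first.page_size)


# Two param sets that agree on every shared field.
a = ToyParams(128, 1, 1, False, 0.0)
b = ToyParams(128, 1, 1, False, 0.0)
multi = ToyMultiParams.from_params(a, b)
```

Checking the shared fields once at construction time lets the aggregate expose a single page_size (and similar values) without having to reconcile conflicting settings later.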
get_symbolic_inputs()
get_symbolic_inputs(prefix='')
Returns the symbolic inputs for the KV cache.
Parameters:
prefix (str)
Return type:
host_kvcache_swap_space_gb
kv_connector
kv_connector: KVConnectorType | None
n_devices
n_devices: int
num_eagle_speculative_tokens
num_eagle_speculative_tokens: int = 0
page_size
page_size: int
params
params: Sequence[KVCacheParams]
List of KV cache parameter sets to aggregate.