Python class

MultiKVCacheParams

class max.nn.kv_cache.MultiKVCacheParams(params, page_size, data_parallel_degree, n_devices, kv_connector, host_kvcache_swap_space_gb, num_eagle_speculative_tokens=0)

Bases: KVCacheParamInterface

Aggregates multiple KV cache parameter sets.

This class implements KVCacheParamInterface by aggregating multiple KVCacheParamInterface instances. It is useful for models that maintain several distinct KV caches (e.g., different cache configurations for different layers).

Parameters:

  • params (Sequence[KVCacheParams])
  • page_size (int)
  • data_parallel_degree (int)
  • n_devices (int)
  • kv_connector (KVConnectorType | None)
  • host_kvcache_swap_space_gb (float | None)
  • num_eagle_speculative_tokens (int)

bytes_per_block

property bytes_per_block: int

Total bytes per block across all KV caches.

Because every cache allocates memory for the same sequence, the total memory cost per block is the sum of bytes_per_block over all aggregated param sets.
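
As a rough illustration of the summation described above, the following sketch uses a hypothetical stand-in class (`FakeParams`, which is not part of MAX) that exposes only a `bytes_per_block` value. The helper name `total_bytes_per_block` is likewise made up for this example. It shows the arithmetic only and is not the library's actual implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FakeParams:
    """Hypothetical stand-in for one KV cache parameter set (not a MAX class)."""

    bytes_per_block: int


def total_bytes_per_block(params: Sequence[FakeParams]) -> int:
    # Each cache stores a block for the same sequence, so the per-block
    # cost of the aggregate is the sum of the individual per-block costs.
    return sum(p.bytes_per_block for p in params)


# Two illustrative caches with different (assumed) per-block sizes.
small_cache = FakeParams(bytes_per_block=2 * 1024 * 1024)
large_cache = FakeParams(bytes_per_block=6 * 1024 * 1024)

total = total_bytes_per_block([small_cache, large_cache])
print(total)
```

Under this model, adding another cache to the aggregate always raises the cost of a block, which matters when estimating how many blocks fit in a fixed memory budget.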

data_parallel_degree

data_parallel_degree: int

from_params()

classmethod from_params(*params)

Creates a MultiKVCacheParams from one or more KVCacheParams.

Parameters:

params (KVCacheParams) – One or more KVCacheParams instances to aggregate. All params must share the same page_size, data_parallel_degree, n_devices, enable_kvcache_swapping_to_host, and host_kvcache_swap_space_gb values.

Returns:

A new MultiKVCacheParams aggregating all provided params.

Raises:

ValueError – If no params are provided.

Return type:

MultiKVCacheParams
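
The shared-value requirement above could be checked along the following lines. This is a minimal sketch under stated assumptions: `FakeParams` and `check_shared_fields` are hypothetical names, not MAX APIs, and raising `ValueError` on a mismatch is an assumption (the documentation only states that `ValueError` is raised when no params are provided).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FakeParams:
    """Hypothetical stand-in for KVCacheParams (not a MAX class)."""

    page_size: int
    data_parallel_degree: int
    n_devices: int
    enable_kvcache_swapping_to_host: bool = False
    host_kvcache_swap_space_gb: Optional[float] = None


# Fields that must agree across all aggregated param sets in this sketch.
SHARED_FIELDS: Tuple[str, ...] = (
    "page_size",
    "data_parallel_degree",
    "n_devices",
    "enable_kvcache_swapping_to_host",
    "host_kvcache_swap_space_gb",
)


def check_shared_fields(*params: FakeParams) -> FakeParams:
    """Return the first param set after checking that the others agree with it."""
    if not params:
        # Mirrors the documented ValueError for an empty argument list.
        raise ValueError("at least one param set is required")
    for field in SHARED_FIELDS:
        values = {getattr(p, field) for p in params}
        if len(values) > 1:
            # Assumption: the real method may report mismatches differently.
            raise ValueError(
                f"params disagree on {field}: {sorted(values, key=repr)}"
            )
    return params[0]


a = FakeParams(page_size=128, data_parallel_degree=1, n_devices=2)
b = FakeParams(page_size=128, data_parallel_degree=1, n_devices=2)
shared = check_shared_fields(a, b)
print(shared.page_size, shared.n_devices)
```

Validating up front lets the aggregate expose a single page_size, data_parallel_degree, and n_devices value rather than one per param set.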

get_symbolic_inputs()

get_symbolic_inputs(prefix='')

Returns the symbolic inputs for the KV cache.

Parameters:

prefix (str)

Return type:

KVCacheInputs[TensorType, BufferType]

host_kvcache_swap_space_gb

host_kvcache_swap_space_gb: float | None

kv_connector

kv_connector: KVConnectorType | None

n_devices

n_devices: int

num_eagle_speculative_tokens

num_eagle_speculative_tokens: int = 0

page_size

page_size: int

params

params: Sequence[KVCacheParams]

List of KV cache parameter sets to aggregate.