Python module

cache_params

KVCacheParams

class max.pipelines.kv_cache.cache_params.KVCacheParams(dtype: DType, n_kv_heads: int, head_dim: int, enable_prefix_caching: bool = False, cache_strategy: KVCacheStrategy = KVCacheStrategy.CONTINUOUS, n_devices: int = 1)

dtype_shorthand

property dtype_shorthand: str

The shorthand textual representation of the dtype.

static_cache_shape

property static_cache_shape: tuple[str, str, str, str, str]

KVCacheStrategy

class max.pipelines.kv_cache.cache_params.KVCacheStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

CONTINUOUS

CONTINUOUS = 'continuous'

NAIVE

NAIVE = 'naive'

PAGED

PAGED = 'paged'
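
The three member values above are plain strings, so a strategy can be looked up from a configuration value. The following is a minimal sketch, not the MAX implementation: it defines a stand-in enum with the same member names and values so that it runs without the library installed, and `parse_strategy` is a hypothetical helper introduced only for illustration.

```python
from enum import Enum


class KVCacheStrategy(Enum):
    """Stand-in mirroring the documented member values (not the MAX class)."""

    CONTINUOUS = "continuous"
    NAIVE = "naive"
    PAGED = "paged"


def parse_strategy(text: str) -> KVCacheStrategy:
    """Hypothetical helper: map a config string to a strategy member.

    Lookup by value is standard Enum behaviour; an unknown string
    raises ValueError.
    """
    return KVCacheStrategy(text.strip().lower())


# Normalise user input, then look the member up by its value.
strategy = parse_strategy(" Paged ")

# Enumerate the accepted values in definition order.
valid_values = [member.value for member in KVCacheStrategy]
```

With the real class, the same by-value lookup should apply, assuming it is a standard `Enum` as its signature suggests.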

kernel_substring()

kernel_substring() → str

Returns the common substring included in kernel names for this caching strategy.

uses_opaque()

uses_opaque() → bool