Python class

PipelineModelWithKVCache

class max.pipelines.lib.PipelineModelWithKVCache(pipeline_config, session, devices, kv_cache_config, weights, adapter, return_logits, return_hidden_states=ReturnHiddenStates.NONE)

Bases: PipelineModel[BaseContextType]

A pipeline model that supports a key-value (KV) cache.

Parameters:

get_kv_params()

abstract classmethod get_kv_params(huggingface_config, pipeline_config, devices, kv_cache_config, cache_dtype)

Returns the KV cache parameters for the pipeline model.

Parameters:

Return type:

KVCacheParamInterface
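
Because get_kv_params() is an abstract classmethod, each concrete model has to override it, and it can be called on the class itself without building a model instance. The following is a minimal, self-contained sketch of that pattern only. It does not import MAX: `ToyKVParams`, `ToyModelWithKVCache`, `ToyLlama`, and the dictionary keys in `hf_config` are hypothetical stand-ins chosen for illustration, and the real `KVCacheParamInterface` may expose different fields.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyKVParams:
    """Hypothetical stand-in for a KVCacheParamInterface implementation."""

    n_kv_heads: int
    head_dim: int
    num_layers: int
    cache_dtype: str


class ToyModelWithKVCache(ABC):
    """Hypothetical stand-in for PipelineModelWithKVCache."""

    @classmethod
    @abstractmethod
    def get_kv_params(
        cls, huggingface_config, pipeline_config, devices, kv_cache_config, cache_dtype
    ):
        """Return the KV cache parameters for this model."""


class ToyLlama(ToyModelWithKVCache):
    """Hypothetical concrete model that derives its cache shape from a config dict."""

    @classmethod
    def get_kv_params(
        cls, huggingface_config, pipeline_config, devices, kv_cache_config, cache_dtype
    ):
        # Assumed config keys; a real Hugging Face config is an object, not a dict.
        hidden_size = huggingface_config["hidden_size"]
        n_heads = huggingface_config["num_attention_heads"]
        return ToyKVParams(
            n_kv_heads=huggingface_config["num_key_value_heads"],
            head_dim=hidden_size // n_heads,
            num_layers=huggingface_config["num_hidden_layers"],
            cache_dtype=cache_dtype,
        )


hf_config = {
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "num_hidden_layers": 32,
}

# Called on the class: no model instance, session, or weights are needed.
params = ToyLlama.get_kv_params(hf_config, None, [], None, "bfloat16")

# The abstract base cannot be instantiated until get_kv_params is overridden.
try:
    ToyModelWithKVCache()
    base_is_abstract = False
except TypeError:
    base_is_abstract = True
```

Keeping the computation in a classmethod means the cache parameters can be derived from configuration alone, for example to estimate memory needs before any weights are loaded. Whether MAX uses it this way is an assumption of this sketch.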

kv_params

kv_params: KVCacheParamInterface
