Skip to main content

Python class

IncrementCacheLengthsProcessor

IncrementCacheLengthsProcessor

class max.pipelines.kv_cache.IncrementCacheLengthsProcessor(session, params)

source

Bases: object

Processes KV cache length increments after each decoding step.

Parameters:

execute()

execute(kv_cache_inputs, prev_model_inputs)

source

Runs the increment cache lengths graph and returns updated cache inputs.

Parameters:

Return type:

KVCacheInputs[Buffer, Buffer]