Python function

compute_max_seq_len_fitting_in_cache

max.nn.kv_cache.compute_max_seq_len_fitting_in_cache(params, available_cache_memory)

Computes the maximum sequence length whose KV cache fits in the available cache memory.

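Parameters:

params – Parameters describing the KV cache.

available_cache_memory – The amount of memory available for the cache.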

Returns:

The maximum sequence length that can fit in the available cache memory.

Return type:

int
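
To make the idea concrete, here is a minimal sketch of how such a limit could be estimated. It is not the library's implementation. The class `ToyKVCacheParams`, its field names, and the function `toy_max_seq_len_fitting_in_cache` are hypothetical, and the sketch assumes a single sequence, memory measured in bytes, and a simple contiguous cache that stores one key and one value vector per token, layer, and KV head.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyKVCacheParams:
    """Hypothetical stand-in for the real params object (field names are assumptions)."""

    n_layers: int
    n_kv_heads: int
    head_dim: int
    dtype_bytes: int  # e.g. 2 for float16 / bfloat16


def toy_max_seq_len_fitting_in_cache(
    params: ToyKVCacheParams, available_cache_memory: int
) -> int:
    """Estimate how many tokens of one sequence fit in the given number of bytes."""
    # Each token stores a key AND a value vector (factor of 2)
    # for every layer and every KV head.
    bytes_per_token = (
        2
        * params.n_layers
        * params.n_kv_heads
        * params.head_dim
        * params.dtype_bytes
    )
    # Floor division: a partially stored token does not count.
    return available_cache_memory // bytes_per_token


# Assumed example configuration: 32 layers, 8 KV heads, head dim 128, 2-byte dtype.
toy_params = ToyKVCacheParams(n_layers=32, n_kv_heads=8, head_dim=128, dtype_bytes=2)
toy_limit = toy_max_seq_len_fitting_in_cache(toy_params, 8 * 1024**3)  # 8 GiB budget
```

With this assumed configuration each token costs 128 KiB, so an 8 GiB budget holds 65536 tokens. The real function may account for details this sketch ignores, such as paging or batch size.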