Python function

estimated_memory_size

estimated_memory_size()

max.nn.kv_cache.estimated_memory_size(params, available_cache_memory, max_batch_size, max_seq_len)

Computes the estimated memory size of the KV cache used by all replicas.

Parameters:

  • params (KVCacheParamInterface) – The KV cache parameters.
  • available_cache_memory (int) – The amount of cache memory available across all devices, in bytes.
  • max_batch_size (int) – The maximum batch size.
  • max_seq_len (int) – The maximum sequence length.

Returns:

The estimated memory usage of the KV cache in bytes.

Return type:

int
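
Example:

The following is a minimal, self-contained sketch of how an estimate of this kind could be computed. It does not call the MAX library and is not its implementation. The class `ToyKVParams`, its fields, the helper `toy_estimated_memory_size`, and the rule of capping the worst-case demand at the available memory are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyKVParams:
    """Hypothetical stand-in for a KV cache parameter object (not the MAX type)."""

    num_layers: int
    n_kv_heads: int
    head_dim: int
    dtype_bytes: int  # e.g. 2 for float16 or bfloat16

    def bytes_per_token(self) -> int:
        # The factor of 2 accounts for the separate key and value tensors.
        return 2 * self.num_layers * self.n_kv_heads * self.head_dim * self.dtype_bytes


def toy_estimated_memory_size(
    params: ToyKVParams,
    available_cache_memory: int,
    max_batch_size: int,
    max_seq_len: int,
) -> int:
    """Illustrative estimate in bytes: worst-case demand, capped by available memory.

    This capping rule is an assumption of the sketch, not documented behavior.
    """
    worst_case = params.bytes_per_token() * max_batch_size * max_seq_len
    return min(worst_case, available_cache_memory)


params = ToyKVParams(num_layers=32, n_kv_heads=8, head_dim=128, dtype_bytes=2)
estimate = toy_estimated_memory_size(
    params,
    available_cache_memory=8 * 1024**3,  # 8 GiB budget
    max_batch_size=4,
    max_seq_len=2048,
)
print(estimate)  # 1073741824 bytes (1 GiB)
```

In this sketch the arguments mirror the documented signature, so the same values could be passed to the real function once a concrete KVCacheParamInterface object is available; the real function may return a different number.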