Python function
estimated_memory_size
estimated_memory_size()
max.nn.kv_cache.estimated_memory_size(params, available_cache_memory, max_batch_size, max_seq_len)
Computes the estimated memory size of the KV cache used by all replicas.
Parameters:
- available_cache_memory (int) – The amount of cache memory available across all devices.
- max_batch_size (int) – The maximum batch size.
- max_seq_len (int) – The maximum sequence length.
- params (KVCacheParamInterface)
Returns:
The estimated memory usage of the KV cache in bytes.
Return type:
int
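
To give a feel for what an estimate of this kind involves, here is a minimal, self-contained sketch. It is not the MAX implementation: `ToyKVParams`, `toy_bytes_per_token`, and `toy_estimated_memory_size` are hypothetical names, and the formula (one key and one value vector per layer and KV head, capped at the available memory) is an assumption chosen for illustration. The real function may also account for paging, replicas, and other details not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyKVParams:
    """Hypothetical stand-in for KV cache parameters (not KVCacheParamInterface)."""

    n_layers: int
    n_kv_heads: int
    head_dim: int
    dtype_bytes: int  # e.g. 2 for float16 or bfloat16


def toy_bytes_per_token(params: ToyKVParams) -> int:
    # Assumption: each token stores one key and one value vector
    # per layer and per KV head.
    return 2 * params.n_layers * params.n_kv_heads * params.head_dim * params.dtype_bytes


def toy_estimated_memory_size(
    params: ToyKVParams,
    available_cache_memory: int,
    max_batch_size: int,
    max_seq_len: int,
) -> int:
    # Worst case: every sequence in the batch grows to max_seq_len.
    requested = toy_bytes_per_token(params) * max_batch_size * max_seq_len
    # Assumption: the estimate never exceeds the memory that is available.
    return min(requested, available_cache_memory)


if __name__ == "__main__":
    params = ToyKVParams(n_layers=32, n_kv_heads=8, head_dim=128, dtype_bytes=2)
    size = toy_estimated_memory_size(
        params,
        available_cache_memory=8 * 1024**3,
        max_batch_size=4,
        max_seq_len=2048,
    )
    print(f"Estimated KV cache size: {size} bytes")
```

In this toy model the result is always in bytes and is bounded above by available_cache_memory, which mirrors the parameter and return descriptions above without claiming to reproduce the library's actual calculation.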