Python module
max.kv_cache
Cache manager
| Name | Description |
|---|---|
| DummyKVCache | No-op KV cache implementation for testing or when the cache is disabled. |
| InsufficientBlocksError | Exception raised when there are insufficient free blocks to satisfy an allocation. |
| PagedKVCacheManager | Paged KV cache manager with data and tensor parallelism support. |
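
To illustrate what "paged" allocation and an "insufficient free blocks" error mean in general, here is a minimal, self-contained sketch of a fixed-size block pool. It is not the API of `PagedKVCacheManager`; `ToyBlockPool`, `OutOfBlocksError`, and all of their parameters are hypothetical names invented for this illustration.

```python
class OutOfBlocksError(RuntimeError):
    """Hypothetical stand-in for an 'insufficient free blocks' error."""


class ToyBlockPool:
    """Toy paged allocator: each request owns a list of fixed-size blocks."""

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        self.free = list(range(num_blocks))  # IDs of unallocated blocks
        self.owned: dict[str, list[int]] = {}  # request ID -> block IDs

    def allocate(self, request_id: str, num_tokens: int) -> list[int]:
        """Grow a request's block list until it can hold num_tokens tokens."""
        needed = -(-num_tokens // self.block_size)  # ceiling division
        have = len(self.owned.get(request_id, []))
        extra = max(0, needed - have)
        if extra > len(self.free):
            raise OutOfBlocksError(
                f"need {extra} more blocks, only {len(self.free)} free"
            )
        blocks = self.owned.setdefault(request_id, [])
        for _ in range(extra):
            blocks.append(self.free.pop())
        return blocks

    def release(self, request_id: str) -> None:
        """Return all of a request's blocks to the free list."""
        self.free.extend(self.owned.pop(request_id, []))
```

In this sketch a caller would catch `OutOfBlocksError` and either wait for other requests to release their blocks or reject the new request.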
Transfer engine
| Name | Description |
|---|---|
| KVTransferEngine | KV cache transfer engine with support for data parallelism (DP) and tensor parallelism (TP). |
| KVTransferEngineMetadata | Metadata associated with a transfer engine. |
| TransferReqData | Metadata associated with a transfer request. |
Factory functions
| Name | Description |
|---|---|
| available_port | Finds an available TCP port in the given range. |
| load_kv_manager | Loads a single KV cache manager from the given params. |
| load_multi_kv_managers | Loads a list of KV cache managers from the given params. |
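
For readers unfamiliar with port probing, the sketch below shows one common way to find a free TCP port in a range using only the Python standard library. It is an illustration of the general technique, not the implementation or signature of `available_port`; the function name `find_free_port` and its parameters are assumptions made for this example.

```python
import socket


def find_free_port(start: int, end: int, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, end) that can be bound on host.

    Hypothetical helper for illustration only.
    """
    for port in range(start, end):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use or not permitted; try the next one
            return port  # the socket is closed when the with-block exits
    raise RuntimeError(f"no free TCP port in range [{start}, {end})")
```

Because the probe socket is closed before the port is returned, another process could claim the port in the meantime, so callers of a helper like this should be prepared for a later bind to fail.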