Python module

max.kv_cache

Cache manager

DummyKVCache: No-op KV cache implementation for testing or when cache is disabled.
InsufficientBlocksError: Exception raised when there are insufficient free blocks to satisfy an allocation.
PagedKVCacheManager: Paged KVCache manager with data and tensor parallelism support.
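
To make the idea of a paged cache running out of free blocks concrete, here is a minimal, self-contained sketch. It does not use or reproduce the `max.kv_cache` API: `ToyBlockPool` and `OutOfBlocksError` are hypothetical names invented for illustration, and the real `PagedKVCacheManager` and `InsufficientBlocksError` may behave differently.

```python
from __future__ import annotations


class OutOfBlocksError(RuntimeError):
    """Hypothetical stand-in for an 'insufficient free blocks' error."""


class ToyBlockPool:
    """Hypothetical fixed-size pool of cache block IDs (not the max.kv_cache API)."""

    def __init__(self, total_blocks: int) -> None:
        # Every block starts out free; blocks are identified by integer IDs.
        self._free = list(range(total_blocks))

    @property
    def free_count(self) -> int:
        return len(self._free)

    def allocate(self, count: int) -> list[int]:
        # All-or-nothing: grant every requested block or raise without
        # changing the pool.
        if count > len(self._free):
            raise OutOfBlocksError(
                f"requested {count} blocks, only {len(self._free)} free"
            )
        granted = self._free[:count]
        self._free = self._free[count:]
        return granted

    def release(self, blocks: list[int]) -> None:
        # Return blocks to the pool so later requests can reuse them.
        self._free.extend(blocks)
```

In this sketch a caller would catch the error, release blocks held by finished requests, and retry the allocation.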

Transfer engine

KVTransferEngine: KVCache Transfer Engine with support for Data Parallelism (DP) and Tensor Parallelism (TP).
KVTransferEngineMetadata: Metadata associated with a transfer engine.
TransferReqData: Metadata associated with a transfer request.

Factory functions

available_port: Finds an available TCP port in the given range.
load_kv_manager: Loads a single KV cache manager from the given params.
load_multi_kv_managers: Loads a list of KV cache managers from the given params.
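
As an illustration of what finding an available TCP port in a range can involve, the sketch below uses only the Python standard library. It is not the implementation or signature of `available_port`: the function name `find_free_port`, its parameters, and the half-open range convention are all assumptions made for this example.

```python
import socket


def find_free_port(start: int, end: int, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, end) that can be bound on host.

    Hypothetical helper for illustration; not part of max.kv_cache.
    """
    for port in range(start, end):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                # A successful bind means nothing else holds this port right now.
                sock.bind((host, port))
            except OSError:
                # Port is in use or otherwise unavailable; try the next one.
                continue
            return port
    raise RuntimeError(f"no free TCP port in range [{start}, {end})")
```

The probe socket is closed before the function returns, so another process could claim the port before the caller binds it. Code that relies on this pattern should be prepared to retry.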