Python class
KVTransferEngineMetadata
class max.kv_cache.KVTransferEngineMetadata(*, name, total_num_pages, bytes_per_page, memory_type, hostname, agents_meta, replicate_kv_across_tp=False)
Bases: Struct
Metadata associated with a transfer engine.
This is safe to send between threads/processes.
agents_meta
Metadata for each replica's agents, indexed as [replica][tp_shard].
bytes_per_page
bytes_per_page: int
Bytes per page for each tensor.
hostname
hostname: str
Hostname of the machine that the transfer engine is running on.
memory_type
memory_type: MemoryType
Memory type of the transfer engine.
name
name: str
Base name of the transfer engine.
replicate_kv_across_tp
replicate_kv_across_tp: bool
True if and only if KV buffers are identical across tensor-parallel (TP)
ranks (e.g. MLA with num_kv_heads=1). When the two sides declare different
(dp, tp) layouts but one side replicates, the engine can reinterpret the
replicating side as [dp*tp][1]. This lets a prefill worker at
(DP=m, TP=n) connect to a decode worker at (DP=m*n, TP=1).
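The [dp*tp][1] reinterpretation described above can be pictured as flattening the [replica][tp_shard] grid so that every TP shard becomes its own single-shard replica. The following is a minimal sketch of that index arithmetic only. The helper name flatten_replicated and the string labels are hypothetical, and this page does not document how the engine performs the step internally.

```python
from typing import TypeVar

T = TypeVar("T")


def flatten_replicated(agents_meta: list[list[T]]) -> list[list[T]]:
    """Reinterpret a [dp][tp] grid as [dp*tp][1].

    Hypothetical helper: only meaningful when every TP shard holds an
    identical copy of the KV buffers (replicate_kv_across_tp=True).
    """
    # Each shard becomes a replica of its own with exactly one shard.
    return [[shard] for replica in agents_meta for shard in replica]


# Illustrative labels standing in for per-agent metadata (DP=2, TP=2).
prefill = [[f"dp{d}-tp{t}" for t in range(2)] for d in range(2)]

flat = flatten_replicated(prefill)
print(len(flat), len(flat[0]))  # 4 1
```

After flattening, a (DP=2, TP=2) layout has the same shape as a (DP=4, TP=1) layout, which is the pairing the description refers to.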
total_num_pages
total_num_pages: int
Total number of pages in each tensor.
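Since bytes_per_page and total_num_pages are both stated per tensor, their product gives the size of one tensor's paged buffer. The sketch below shows that arithmetic under this reading. The function name tensor_bytes and the example values are assumptions for illustration and are not part of the max.kv_cache API.

```python
def tensor_bytes(total_num_pages: int, bytes_per_page: int) -> int:
    """Size in bytes of one tensor's paged buffer (hypothetical helper)."""
    if total_num_pages < 0 or bytes_per_page < 0:
        raise ValueError("page count and page size must be non-negative")
    return total_num_pages * bytes_per_page


# Example values chosen for illustration: 1024 pages of 64 KiB each.
size = tensor_bytes(total_num_pages=1024, bytes_per_page=64 * 1024)
print(size)  # 67108864 (64 MiB)
```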