Skip to main content
Log in

Python module

radix_trie

RadixTrie

class max.pipelines.kv_cache.radix_trie.RadixTrie

evict_blocks()

evict_blocks(desired_num_evicted: int) → List[Any]

Attempt to evict at most desired_num_evicted blocks from trie.

insert()

insert(tokens: List[Any], blocks: List[Any], node: TrieNode | None = None) → TrieNode

Inserts tokens and blocks into the trie.

We assume that each block contains exactly one token so the length of both input lists must match.

  • Parameters:

    • tokens – Tokens to insert into trie
    • blocks – KV cache block for each token
    • node – Node to begin insertion at. By default, insertion begins at root node
  • Returns:

    Node corresponding to end of the sequence where future : generated tokens can be inserted

  • Return type:

    trie_node

mark_in_use_by()

mark_in_use_by(node: TrieNode, seq_id: int)

Climb up the trie starting from node, marking each node as being in use by this seq.

mark_not_in_use_by()

mark_not_in_use_by(node: TrieNode, seq_id: int)

Climb up the trie starting from node, marking each node as no longer in use by this seq. Since nodes without any users may be eligible for eviction, we also update its last_access_time.

match_prefix()

match_prefix(tokens: List[Any]) → Tuple[TrieNode, List[Any]]

Matches the input tokens with the contents of the trie.

  • Parameters:

    tokens – tokens to search the trie for

  • Returns:

    • trie_node: Node corresponding to end of matched prefix where : future generated tokens can be inserted
    • block_list: KV cache blocks for matched prefix
  • Return type:

    Tuple containing

pretty_format()

pretty_format() → List[str]

Formats the contents of the trie.

TrieNode

class max.pipelines.kv_cache.radix_trie.TrieNode