
Python class

BatchCharacteristics


class max.nn.kv_cache.BatchCharacteristics(batch_size, max_prompt_length, max_cache_valid_length)


Bases: object

Upper-bound batch shape used to prepare decode attention metadata.

Captures the (batch_size, max_prompt_length, max_cache_valid_length) for which a decode forward pass should prepare its attention dispatch metadata. These values are upper bounds: they may exceed the batch’s real per-request values, but the real values must not exceed them. PagedKVCacheManager.runtime_inputs() uses this object to resolve the dispatch key once. For example, during graph-capture replay, max_cache_valid_length is aligned up to a cache length recorded during capture, and every data-parallel replica must run the identical captured graph.

Parameters:

  • batch_size (int)
  • max_prompt_length (int)
  • max_cache_valid_length (int)
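
The upper-bound relationship described above can be illustrated with a small, self-contained sketch. It does not import or call the MAX library. `BatchCharacteristicsSketch` is a hypothetical stand-in that only mirrors the three documented fields, and `upper_bound_for`, `covers`, and the `captured_cache_lengths` list are assumptions made for illustration, not part of the documented API. How the real cache manager picks the aligned length is not specified here; the sketch simply assumes "smallest captured length that covers the batch".

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchCharacteristicsSketch:
    """Hypothetical stand-in mirroring the three documented fields."""

    batch_size: int
    max_prompt_length: int
    max_cache_valid_length: int


def upper_bound_for(prompt_lengths, cache_valid_lengths, captured_cache_lengths):
    """Build an upper bound that covers every request in the batch.

    `captured_cache_lengths` is an assumed, ascending list of cache lengths
    recorded during graph capture (illustrative only).
    """
    real_max_cache = max(cache_valid_lengths)
    # Align up: pick the smallest captured length >= the real maximum.
    idx = bisect_left(captured_cache_lengths, real_max_cache)
    if idx == len(captured_cache_lengths):
        raise ValueError("no captured cache length covers this batch")
    return BatchCharacteristicsSketch(
        batch_size=len(prompt_lengths),
        max_prompt_length=max(prompt_lengths),
        max_cache_valid_length=captured_cache_lengths[idx],
    )


def covers(bound, prompt_lengths, cache_valid_lengths):
    """Return True if the batch's real values do not exceed the bound."""
    return (
        len(prompt_lengths) <= bound.batch_size
        and max(prompt_lengths) <= bound.max_prompt_length
        and max(cache_valid_lengths) <= bound.max_cache_valid_length
    )


# Three decode requests, one new token each, with differing cache lengths.
prompt_lengths = [1, 1, 1]
cache_valid_lengths = [37, 120, 64]
bound = upper_bound_for(prompt_lengths, cache_valid_lengths, [128, 256, 512])
print(bound)
print(covers(bound, prompt_lengths, cache_valid_lengths))
```

In this sketch the real maximum cache length of 120 is aligned up to the captured length 128, so the bound exceeds the batch's real values while still covering them. A batch containing a request with a cache length above 128 would fail the `covers` check against the same bound.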

batch_size

batch_size: int


max_cache_valid_length

max_cache_valid_length: int


max_prompt_length

max_prompt_length: int
