Python class

AttentionDispatchMetadata

class max.nn.kv_cache.AttentionDispatchMetadata(tensor)

Bases: NestedIterableDataclass[_DispatchMetadataT], Generic[_DispatchMetadataT]

Wraps the scalar attention dispatch metadata tensor for a single device.

The wrapped tensor must have dtype int64 and rank 1. It encodes the four dispatch scalars consumed by ragged decode kernels: batch size, maximum query sequence length, number of partitions, and maximum cache valid length.
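
As an illustration of the layout described above, the following sketch packs four scalars into a rank-1 int64 NumPy array and checks the two documented constraints (dtype and rank). It does not call the MAX API. The helper names `pack_dispatch_scalars` and `check_dispatch_tensor`, the field names, and the field order (taken from the order in which the description lists the scalars) are assumptions made for this example, not something this page confirms.

```python
import numpy as np

# ASSUMPTION: field names and order mirror the order in which the description
# lists the four scalars; the actual in-memory layout is not confirmed here.
FIELDS = (
    "batch_size",
    "max_query_seq_len",
    "num_partitions",
    "max_cache_valid_length",
)


def pack_dispatch_scalars(batch_size, max_query_seq_len,
                          num_partitions, max_cache_valid_length):
    """Hypothetical helper: pack the four scalars into a rank-1 int64 array."""
    return np.array(
        [batch_size, max_query_seq_len, num_partitions, max_cache_valid_length],
        dtype=np.int64,
    )


def check_dispatch_tensor(arr):
    """Hypothetical helper: enforce the documented dtype and rank constraints."""
    if arr.dtype != np.int64:
        raise ValueError(f"expected dtype int64, got {arr.dtype}")
    if arr.ndim != 1:
        raise ValueError(f"expected rank 1, got rank {arr.ndim}")
    return arr


# Example values are made up for illustration.
packed = check_dispatch_tensor(pack_dispatch_scalars(8, 1, 4, 2048))

# Read the scalars back by name, using the assumed field order.
decoded = dict(zip(FIELDS, packed.tolist()))
```

An array like `packed` is the kind of value the `tensor` parameter is described as wrapping; how it is converted to the concrete `_DispatchMetadataT` type depends on the surrounding MAX code and is not shown here.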

Parameters: tensor (_DispatchMetadataT)

`tensor`

tensor: _DispatchMetadataT
