
Python module

attention_without_mask

An opaque, KV-cache-optimized vanilla attention mechanism in which the mask variant is generated inside the kernel rather than supplied as an input tensor.

AttentionWithoutMask

class max.pipelines.nn.attention.attention_without_mask.AttentionWithoutMask(n_heads: int, kv_params: KVCacheParams, layer_idx: TensorValue, wqkv: TensorValue, wo: Linear, mask_variant: MHAMaskVariant)

mask_variant

mask_variant: MHAMaskVariant
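
Because the constructor takes graph-time tensor values, the layer has to be built inside an active graph. Below is a minimal construction sketch: only the AttentionWithoutMask constructor signature and the MHAMaskVariant import path come from this page. The KVCacheParams import path and field names, the Linear import path, and the CAUSAL_MASK enum member are assumptions that may differ between releases.

```python
# Minimal construction sketch, not a verbatim API sample: import paths,
# KVCacheParams fields, and the CAUSAL_MASK member are assumptions.
import numpy as np

from max.dtype import DType
from max.graph import Graph, ops
from max.pipelines.kv_cache import KVCacheParams  # assumed path
from max.pipelines.nn import Linear  # assumed path
from max.pipelines.nn.attention.attention_without_mask import (
    AttentionWithoutMask,
)
from max.pipelines.nn.kernels import MHAMaskVariant

n_heads, head_dim = 8, 64
hidden = n_heads * head_dim

kv_params = KVCacheParams(  # field names are assumptions
    dtype=DType.float32,
    n_kv_heads=n_heads,
    head_dim=head_dim,
)

with Graph("attention_without_mask_example") as graph:
    # Graph-time constants standing in for real loaded weights.
    layer_idx = ops.constant(np.array(0, dtype=np.uint32), DType.uint32)
    wqkv = ops.constant(
        np.zeros((3 * hidden, hidden), dtype=np.float32), DType.float32
    )
    wo = Linear(
        ops.constant(
            np.zeros((hidden, hidden), dtype=np.float32), DType.float32
        )
    )

    # The mask is selected by enum and generated inside the kernel,
    # so no mask tensor is passed when the layer is invoked.
    attn = AttentionWithoutMask(
        n_heads=n_heads,
        kv_params=kv_params,
        layer_idx=layer_idx,
        wqkv=wqkv,
        wo=wo,
        mask_variant=MHAMaskVariant.CAUSAL_MASK,  # assumed enum member
    )
```

Because the mask variant is baked into the kernel call, switching masking schemes is a constructor-time choice rather than a per-call tensor argument.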