
Mojo module

config

GFX950 attention config.

Supports both prefill (token_gen=False) and decode (token_gen=True).

Matches the config target in amd/mha.mojo. Both modes use full_kv=True and depth_padded=False. Prefill sets double_buffer=True. Decode sets double_buffer=False, enables double_buffer_k_only when BN <= 64, and enables shared_kv only when depth > 256, because of the shared-memory (SMEM) budget.
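The flag choices described above can be modeled in a few lines. The following is a minimal Python sketch of those selection rules only, not the Mojo API: the names `AttentionConfigSketch` and `select_config` are hypothetical, and the values of double_buffer_k_only and shared_kv for prefill are an assumption (the description does not state them, so the sketch leaves them off).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionConfigSketch:
    """Hypothetical stand-in for the flags named in the module description."""

    full_kv: bool
    depth_padded: bool
    double_buffer: bool
    double_buffer_k_only: bool
    shared_kv: bool


def select_config(token_gen: bool, bn: int, depth: int) -> AttentionConfigSketch:
    """Pick flags for prefill (token_gen=False) or decode (token_gen=True).

    bn is the BN tile size and depth is the head depth, as named in the
    description. This mirrors the documented rules; it is not the real code.
    """
    if not token_gen:
        # Prefill: full double buffering.
        # ASSUMPTION: the K-only and shared-KV flags are off for prefill;
        # the description does not say either way.
        return AttentionConfigSketch(
            full_kv=True,
            depth_padded=False,
            double_buffer=True,
            double_buffer_k_only=False,
            shared_kv=False,
        )

    # Decode: no full double buffering; K-only double buffering for small
    # BN tiles, and shared K/V storage only at large depth (SMEM budget).
    return AttentionConfigSketch(
        full_kv=True,
        depth_padded=False,
        double_buffer=False,
        double_buffer_k_only=bn <= 64,
        shared_kv=depth > 256,
    )
```

For example, `select_config(token_gen=True, bn=64, depth=128)` yields a decode config with K-only double buffering on and shared K/V off, while raising depth to 512 turns shared K/V on.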

Structs