
Mojo module

config

GFX950 attention config.

Supports both prefill (token_gen=False) and decode (token_gen=True).

Matches the config target in amd/mha.mojo. Both modes use full_kv=True and depth_padded=False. Prefill sets double_buffer=True. Decode sets double_buffer=False, enables double_buffer_k_only when BN <= 64, and enables shared_kv only when depth > 256 (because of the SMEM budget).
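The flag selection described above can be sketched as a small decision function. This is an illustrative Python model only, not the module's Mojo API: the names `AttentionFlags` and `select_flags` and the parameters `bn` and `depth` are hypothetical, and flags the description does not state for prefill are left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttentionFlags:
    """Hypothetical container mirroring the flag names in the description."""
    full_kv: bool
    depth_padded: bool
    double_buffer: bool
    # None means "not stated in the module description for this mode".
    double_buffer_k_only: Optional[bool]
    shared_kv: Optional[bool]


def select_flags(token_gen: bool, bn: int, depth: int) -> AttentionFlags:
    """Return the flags the description lists for prefill or decode.

    token_gen=False models prefill, token_gen=True models decode.
    bn stands for the BN tile size and depth for the head depth (assumed names).
    """
    if not token_gen:
        # Prefill: only double_buffer=True is stated beyond the shared flags.
        return AttentionFlags(
            full_kv=True,
            depth_padded=False,
            double_buffer=True,
            double_buffer_k_only=None,
            shared_kv=None,
        )
    # Decode: no full double buffering; K-only double buffering for small BN,
    # shared K/V storage only at large depth (SMEM budget).
    return AttentionFlags(
        full_kv=True,
        depth_padded=False,
        double_buffer=False,
        double_buffer_k_only=bn <= 64,
        shared_kv=depth > 256,
    )
```

For example, `select_flags(token_gen=True, bn=64, depth=128)` models a decode configuration with K-only double buffering and separate K/V storage.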

Structs
