Mojo module
iglp
IGroupLP sched_group_barrier aggregate-pair helpers for AMD MHA.
Comptime-recursive expansions of HipKittens' sched_barrier_pairs<Pairs, VALU_CNT, Group> and sched_barrier_exp_pairs<...> C++ templates
(see ~/HipKittens/kernels/attn/gqa_causal/kernel.cpp:44-56).
These helpers prescribe IGroupLP groupings to LLVM's AMDGPU instruction
scheduler via the llvm.amdgcn.sched.group.barrier intrinsic. They were
previously defined identically in 5 attention kernels (hk_mha, hk_mha_hk_exact,
hk_mha_prefill, hk_mha_hk_exact_v3, aiter_mha) and were moved into this
shared module to (a) give one place to fix when the language evolves and
(b) reduce duplication.
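The recursive expansion described above can be modeled on the host to see which calls a given instantiation would produce. The sketch below is an illustration in Python, not the module's Mojo source: it records (mask, size, sync_id) tuples instead of emitting intrinsics. The mask values (0x008 for MFMA, 0x002 for VALU) are taken from LLVM's AMDGPU scheduling-barrier mask table, and the parameter names are assumptions chosen to mirror the C++ template parameters.

```python
# Host-side model only: records the calls instead of emitting intrinsics.
MFMA_MASK = 0x008  # MFMA/WMMA, per LLVM's AMDGPU sched barrier mask table
VALU_MASK = 0x002  # VALU


def sched_barrier_pairs(pairs, valu_cnt, group, out=None):
    """Model the recursion: one [1 MFMA, valu_cnt VALU] group per step.

    `pairs`, `valu_cnt`, and `group` stand in for the compile-time
    parameters Pairs, VALU_CNT, and Group (names assumed for illustration).
    """
    if out is None:
        out = []
    if pairs <= 0:
        # Base case: nothing left to emit.
        return out
    out.append((MFMA_MASK, 1, group))         # one MFMA instruction
    out.append((VALU_MASK, valu_cnt, group))  # followed by valu_cnt VALU
    return sched_barrier_pairs(pairs - 1, valu_cnt, group, out)


calls = sched_barrier_pairs(pairs=2, valu_cnt=3, group=0)
for mask, size, sync_id in calls:
    print(f"sched.group.barrier(mask={mask:#06x}, size={size}, sync_id={sync_id})")
```

Every call in one expansion shares the same sync_id, so the scheduler treats the emitted groups as a single ordered pipeline.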
Per-kernel hint-pair parameters (which N, M for QK / PV / EXP cluster
types) are tuned via parameter sweep at the kernel; see
[[patterns/amd-iglp-hint-pair-sweep]]. Only the helper expansion logic
is shared; the per-cluster (N, M) defaults belong with each kernel
(they're shape-dependent and kernel-specific).
For size semantics, sync_id ordering, and why these intrinsics
leave no asm trace, see
[[patterns/amd-iglp-instruction-group-interleave-pattern]].
Structs
- AMDIGLPStrategy: Preset strategy values for the `llvm.amdgcn.iglp.opt` intrinsic.
Functions
- sched_barrier_exp_pairs: Emits `pairs` schedule groups of shape `[1 MFMA, exp_cnt TRANS]`.
- sched_barrier_pairs: Emits `pairs` schedule groups of shape `[1 MFMA, valu_cnt VALU]`.
- sched_dsread_valu_pairs: Emits `pairs` schedule groups of shape `[1 DS_READ, valu_cnt VALU]`.
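To compare the three group shapes side by side, here is a small table-driven model in Python. It is a sketch under stated assumptions rather than the module's implementation: the `expand` helper and both lookup tables are hypothetical, and the mask values come from LLVM's AMDGPU scheduling-barrier mask table (VALU 0x002, MFMA 0x008, DS read 0x100, transcendental 0x400). The masks this module actually passes are not shown on this page.

```python
# Instruction-class masks as listed in LLVM's AMDGPU documentation.
MASKS = {"VALU": 0x002, "MFMA": 0x008, "DS_READ": 0x100, "TRANS": 0x400}

# Hypothetical table: (leading class, trailing class) for each helper's shape.
SHAPES = {
    "sched_barrier_pairs": ("MFMA", "VALU"),
    "sched_barrier_exp_pairs": ("MFMA", "TRANS"),
    "sched_dsread_valu_pairs": ("DS_READ", "VALU"),
}


def expand(helper, pairs, cnt, sync_id):
    """Return the (mask, size, sync_id) calls a helper would model.

    Each pair is one leading instruction followed by `cnt` trailing ones.
    """
    lead, trail = SHAPES[helper]
    calls = []
    for _ in range(pairs):
        calls.append((MASKS[lead], 1, sync_id))
        calls.append((MASKS[trail], cnt, sync_id))
    return calls


for name in SHAPES:
    print(name, expand(name, pairs=1, cnt=2, sync_id=0))
```

The helpers differ only in which instruction classes they pair; the per-kernel (N, M) values that feed `pairs` and `cnt` stay with each kernel.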