Mojo module

iglp

IGroupLP sched_group_barrier aggregate-pair helpers for AMD MHA.

Comptime-recursive expansions of HipKittens' sched_barrier_pairs<Pairs, VALU_CNT, Group> and sched_barrier_exp_pairs<...> C++ templates (see ~/HipKittens/kernels/attn/gqa_causal/kernel.cpp:44-56).

These helpers prescribe IGroupLP groupings to LLVM's AMDGPU instruction scheduler via the llvm.amdgcn.sched.group.barrier intrinsic. They were previously defined identically in 5 attention kernels (hk_mha, hk_mha_hk_exact, hk_mha_prefill, hk_mha_hk_exact_v3, aiter_mha) and now live in this shared module to (a) give one place to fix when the language evolves and (b) reduce duplication.
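As a rough illustration of what a recursive pair expansion produces, the sketch below models it in plain Python. It is not this module's Mojo code: it only builds the list of (mask, size, sync_id) argument triples that the intrinsic would receive. The function names are borrowed from the C++ template names above. The composition of each pair (one MFMA group followed by a VALU or transcendental group) is an assumption made for illustration. The mask values are the instruction-class bits listed in the LLVM AMDGPU documentation for this intrinsic.

```python
# Host-side model only: records the (mask, size, sync_id) triples that a
# recursive pair expansion would pass to llvm.amdgcn.sched.group.barrier.

# Instruction-class mask bits from the LLVM AMDGPU documentation.
MASK_VALU = 0x002   # vector ALU instructions
MASK_MFMA = 0x008   # matrix (MFMA/WMMA) instructions
MASK_TRANS = 0x400  # transcendental instructions (e.g. exp)


def sched_barrier_pairs(pairs, valu_cnt, group):
    """Model `pairs` repetitions of (1 MFMA, valu_cnt VALU) in one sync group.

    Assumption: each pair is one MFMA group followed by one VALU group.
    """
    if pairs <= 0:
        return []  # recursion base case: nothing left to emit
    head = [(MASK_MFMA, 1, group), (MASK_VALU, valu_cnt, group)]
    return head + sched_barrier_pairs(pairs - 1, valu_cnt, group)


def sched_barrier_exp_pairs(pairs, exp_cnt, group):
    """Same shape as above, with a transcendental group instead of VALU.

    Assumption: each pair is one MFMA group followed by one TRANS group.
    """
    if pairs <= 0:
        return []
    head = [(MASK_MFMA, 1, group), (MASK_TRANS, exp_cnt, group)]
    return head + sched_barrier_exp_pairs(pairs - 1, exp_cnt, group)


# Two pairs, three VALU instructions per pair, sync group 1.
calls = sched_barrier_pairs(2, 3, 1)
for mask, size, sync_id in calls:
    print(f"sched.group.barrier(mask={mask:#05x}, size={size}, sync_id={sync_id})")
```

In the real helpers the pair count and sizes are compile-time parameters, so the recursion unrolls into a flat sequence of intrinsic calls. The Python list plays the role of that unrolled sequence.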

Per-kernel hint-pair parameters (which N, M for QK / PV / EXP cluster types) are tuned via parameter sweep at the kernel; see [[patterns/amd-iglp-hint-pair-sweep]]. Only the helper expansion logic is shared; the per-cluster (N, M) defaults belong with each kernel (they're shape-dependent and kernel-specific).

For size semantics, sync_id ordering, and why these intrinsics leave no asm trace, see [[patterns/amd-iglp-instruction-group-interleave-pattern]].

Structs

Functions