Mojo module

hk_mha_mma_op

MHA MMA operator: shape constants, SMEM→register loaders, and MFMA dispatch used by HKMhaPrefill.

Specialized for the v_mfma_f32_32x32x16_bf16 instruction (BF16 inputs, FP32 accumulation) on gfx950. The K side is loaded with the two-XOR worker-index swizzle and ds_read_b128; the V side is loaded with the identity swizzle and the ds_read_tr16_b64_warp transpose load.

comptime values

ACC_ROW_OFFSETS_32x32

comptime ACC_ROW_OFFSETS_32x32 = SIMD(Int32(0), Int32(1), Int32(2), Int32(3), Int32(8), Int32(9), Int32(10), Int32(11), Int32(16), Int32(17), Int32(18), Int32(19), Int32(24), Int32(25), Int32(26), Int32(27), __list_literal__=NoneType(None))
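
The sixteen offsets follow a regular pattern: four groups of four consecutive rows, with the groups spaced 8 rows apart. The Python sketch below (Python rather than Mojo, purely for illustration) checks that closed form against the values listed above. The helper names `acc_row_offset` and `acc_coords` are hypothetical and not part of this module. The lane-to-coordinate mapping in `acc_coords` is an assumption based on the usual 32x32 MFMA accumulator layout on a 64-lane wavefront, not something this page states.

```python
from typing import Tuple

# Offsets copied from ACC_ROW_OFFSETS_32x32 above.
ACC_ROW_OFFSETS_32x32 = [0, 1, 2, 3, 8, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27]


def acc_row_offset(reg: int) -> int:
    """Closed form: group index (reg // 4) strides by 8 rows, reg % 4 walks within the group."""
    return 8 * (reg // 4) + (reg % 4)


def acc_coords(lane: int, reg: int) -> Tuple[int, int]:
    """Map (lane, accumulator register) to (row, col) of the 32x32 tile.

    Assumption, not confirmed by this page: lanes 32..63 hold the rows
    shifted by 4, and each lane owns the column lane % 32.
    """
    row = acc_row_offset(reg) + 4 * (lane // 32)
    col = lane % 32
    return row, col


# The closed form reproduces the documented constant exactly.
derived = [acc_row_offset(r) for r in range(16)]
assert derived == ACC_ROW_OFFSETS_32x32
```

Under that assumed mapping, the 64 lanes times 16 registers cover each of the 1024 tile elements exactly once, which is why the gaps in the offset list (rows 4 to 7, 12 to 15, and so on) are not wasted: they would belong to the other half of the wavefront.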

Structs

  • HKMhaConfig: Shape configuration for HKMhaPrefill.
  • MhaMmaOp: Namespace-style struct holding the shape constants, register-tile layouts, and SMEM→register loaders for HKMhaPrefill. All call sites go through static methods on this struct.