Mojo struct
KernelGeometry
struct KernelGeometry
Bundles kernel-shape inputs and the derived scheduling counts.
Fields:
BM, BN, BK: Block shape (M-tile, N-tile, K-tile per workgroup).
MMA_M, MMA_N, MMA_K: MFMA op shape.
elem_bytes: Element size in bytes (1 for FP8, 2 for BF16/FP16, 4 for FP32).
simd_width: SIMD load width in elements (typically simd_width_of[in_type]()).
is_fp8: True iff elem_bytes == 1.
vm_per_load_a, vm_per_load_b: vmcnt entries per A/B prefetch.
lgkm_per_load_a, lgkm_per_load_b: lgkmcnt entries per A/B frag-load.
Fields
- BM (Int): Block shape M (rows per workgroup tile).
- BN (Int): Block shape N (columns per workgroup tile).
- BK (Int): Block shape K (reduction dimension per workgroup tile).
- MMA_M (Int): MFMA op shape M.
- MMA_N (Int): MFMA op shape N.
- MMA_K (Int): MFMA op shape K.
- elem_bytes (Int): Element size in bytes (1 for FP8, 2 for BF16/FP16, 4 for FP32).
- simd_width (Int): SIMD load width in elements (typically simd_width_of[in_type]()).
- is_fp8 (Bool): True iff elem_bytes == 1 (FP8 dtypes).
- vm_per_load_a (Int): Number of vmcnt entries consumed per channel-A prefetch.
- vm_per_load_b (Int): Number of vmcnt entries consumed per channel-B prefetch.
- lgkm_per_load_a (Int): Number of lgkmcnt entries consumed per channel-A fragment load.
- lgkm_per_load_b (Int): Number of lgkmcnt entries consumed per channel-B fragment load.
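To make the relationships between these fields concrete, here is a minimal Python sketch that mirrors the documented field names. It is a stand-in model only, not the Mojo struct or its constructor. The class name KernelGeometryModel, the mma_tiles helper, and all numeric values (including the vmcnt and lgkmcnt counts) are assumptions chosen for illustration. The only rules taken from this page are the element sizes and the invariant that is_fp8 holds iff elem_bytes == 1.

```python
from dataclasses import dataclass
from typing import Tuple

# Element sizes as stated in the field descriptions.
ELEM_BYTES = {"fp8": 1, "bf16": 2, "fp16": 2, "fp32": 4}


@dataclass(frozen=True)
class KernelGeometryModel:
    """Hypothetical Python mirror of the documented fields."""

    BM: int
    BN: int
    BK: int
    MMA_M: int
    MMA_N: int
    MMA_K: int
    elem_bytes: int
    simd_width: int
    vm_per_load_a: int
    vm_per_load_b: int
    lgkm_per_load_a: int
    lgkm_per_load_b: int

    @property
    def is_fp8(self) -> bool:
        # Documented invariant: true iff elem_bytes == 1.
        return self.elem_bytes == 1

    def mma_tiles(self) -> Tuple[int, int, int]:
        # Illustrative helper (an assumption, not a documented member):
        # how many MFMA op shapes fit along each block dimension.
        return (self.BM // self.MMA_M, self.BN // self.MMA_N, self.BK // self.MMA_K)


# Placeholder values for a BF16 kernel; the wait-counter entries are made up.
geom = KernelGeometryModel(
    BM=256, BN=256, BK=64,
    MMA_M=16, MMA_N=16, MMA_K=32,
    elem_bytes=ELEM_BYTES["bf16"],
    simd_width=8,
    vm_per_load_a=4, vm_per_load_b=4,
    lgkm_per_load_a=8, lgkm_per_load_b=8,
)

print(geom.is_fp8)       # False
print(geom.mma_tiles())  # (16, 16, 2)
```

In the real struct the four per-load counts are described as derived scheduling counts. This page does not give the derivation, so the sketch takes them as plain inputs.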
Implemented traits