Mojo struct

MatmulConfig

struct MatmulConfig[a_type: DType, b_type: DType, c_type: DType, transpose_b: Bool = True]

Static configuration of GPU matmul.

Fields

cta_group (Int)
mma_shape (IndexList[3])
cluster_shape (IndexList[3])
AB_swapped (Bool)
block_swizzle_size (Int)
raster_order (RasterOrder)
register_based_epilogue (Bool)
block_tile_shape (IndexList[3])
num_split_k (Int)
num_pipeline_stages (Int)
num_clc_pipeline_stages (Int)
num_accum_pipeline_stages (Int)
num_output_stages (Int)
output_tile_shape (IndexList[2])
a_swizzle (TensorMapSwizzle)
b_swizzle (TensorMapSwizzle)
c_swizzle (TensorMapSwizzle)
epi_load_swizzle (TensorMapSwizzle)
k_group_size (Int)
prefetch_tiles_n (Int)
gemm_kind (GEMMKind)
use_tma_epilogue_load (Bool)
num_tma_epilogue_pipeline_stages (Int)
epilogue_is_1d (Bool)

Implemented traits

AnyType, Copyable, Equatable, Hashable, ImplicitlyCopyable, ImplicitlyDeletable, Movable, RegisterPassable, TrivialRegisterPassable, Writable

`comptime` members

`accum_type`

comptime accum_type = get_accum_type[a_type]()

Methods

`__init__`

def __init__(*, cta_group: Int = 2, mma_shape: IndexList[3] = get_mma_shape[a_type, MatmulConfig[a_type, b_type, c_type, transpose_b].accum_type](), cluster_shape: IndexList[3] = Index[Int, Int, Int](2, 1, 1), AB_swapped: Bool = False, num_split_k: Int = 1, block_swizzle_size: Int = 0, raster_order: RasterOrder = RasterOrder.AlongM, k_group_size: Int = 1, prefetch_tiles_n: Int = 0, num_pipeline_stages: Optional[Int] = None, num_accum_pipeline_stages: Int = 2, num_clc_pipeline_stages: Int = 2, register_based_epilogue: Bool = True, extra_smem_per_stage: Int = 0, gemm_kind: GEMMKind = GEMMKind.GEMM, use_tma_epilogue_load: Bool = False, num_tma_epilogue_pipeline_stages: Optional[Int] = None, epilogue_is_1d: Bool = False) -> Self

`swap_AB_type`

def swap_AB_type(self) -> MatmulConfig[b_type, a_type, c_type, transpose_b]

Returns:

MatmulConfig[b_type, a_type, c_type, transpose_b]

`write_to`

def write_to[W: Writer](self, mut writer: W)

`write_repr_to`

def write_repr_to(self, mut writer: T)

`get_kernel_name`

def get_kernel_name(self) -> String

Returns:

String
