Mojo struct

MatmulConfig

struct MatmulConfig[a_type: DType, b_type: DType, c_type: DType, transpose_b: Bool = True]

Static configuration of GPU matmul.

Fields

  • cta_group (Int)
  • mma_shape (IndexList[3])
  • cluster_shape (IndexList[3])
  • AB_swapped (Bool)
  • block_swizzle_size (Int)
  • raster_order (RasterOrder)
  • register_based_epilogue (Bool)
  • block_tile_shape (IndexList[3])
  • num_split_k (Int)
  • num_pipeline_stages (Int)
  • num_clc_pipeline_stages (Int)
  • num_accum_pipeline_stages (Int)
  • num_output_stages (Int)
  • output_tile_shape (IndexList[2])
  • a_swizzle (TensorMapSwizzle)
  • b_swizzle (TensorMapSwizzle)
  • c_swizzle (TensorMapSwizzle)
  • epi_load_swizzle (TensorMapSwizzle)
  • k_group_size (Int)
  • prefetch_tiles_n (Int)
  • gemm_kind (GEMMKind)
  • use_tma_epilogue_load (Bool)
  • num_tma_epilogue_pipeline_stages (Int)
  • epilogue_is_1d (Bool)

Implemented traits

AnyType, Copyable, Equatable, Hashable, ImplicitlyCopyable, ImplicitlyDeletable, Movable, RegisterPassable, TrivialRegisterPassable, Writable

comptime members

accum_type

comptime accum_type = get_accum_type[a_type]()

Methods

__init__

def __init__(
    *,
    cta_group: Int = 2,
    mma_shape: IndexList[3] = get_mma_shape[a_type, MatmulConfig[a_type, b_type, c_type, transpose_b].accum_type](),
    cluster_shape: IndexList[3] = Index[Int, Int, Int](2, 1, 1),
    AB_swapped: Bool = False,
    num_split_k: Int = 1,
    block_swizzle_size: Int = 0,
    raster_order: RasterOrder = RasterOrder.AlongM,
    k_group_size: Int = 1,
    prefetch_tiles_n: Int = 0,
    num_pipeline_stages: Optional[Int] = None,
    num_accum_pipeline_stages: Int = 2,
    num_clc_pipeline_stages: Int = 2,
    register_based_epilogue: Bool = True,
    extra_smem_per_stage: Int = 0,
    gemm_kind: GEMMKind = GEMMKind.GEMM,
    use_tma_epilogue_load: Bool = False,
    num_tma_epilogue_pipeline_stages: Optional[Int] = None,
    epilogue_is_1d: Bool = False,
) -> Self

swap_AB_type

def swap_AB_type(self) -> MatmulConfig[b_type, a_type, c_type, transpose_b]

Returns:

MatmulConfig[b_type, a_type, c_type, transpose_b]: a config whose a_type and b_type parameters are exchanged relative to self.
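
The return type shows that `swap_AB_type` exchanges the first two type parameters and carries `c_type` and `transpose_b` over unchanged. A minimal Python sketch of that relationship follows. `ConfigTypes` is a hypothetical stand-in that models only the four parameters as ordinary fields, and the dtype names are illustrative strings. Whether the real method also adjusts any fields (for example `AB_swapped`) is not stated on this page and is not modelled here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ConfigTypes:
    """Hypothetical stand-in for MatmulConfig's four compile-time parameters."""

    a_type: str
    b_type: str
    c_type: str
    transpose_b: bool = True

    def swap_AB_type(self) -> "ConfigTypes":
        # Exchange a_type and b_type; c_type and transpose_b carry over,
        # matching the documented return type.
        return replace(self, a_type=self.b_type, b_type=self.a_type)


original = ConfigTypes(a_type="float8_e4m3fn", b_type="bfloat16", c_type="bfloat16")
swapped = original.swap_AB_type()
print(swapped)
```

In this model, applying the swap twice returns a value equal to the original, since only the order of the two input types changes.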

write_to

def write_to[W: Writer](self, mut writer: W)

write_repr_to

def write_repr_to(self, mut writer: T)

get_kernel_name

def get_kernel_name(self) -> String

Returns:

String