Mojo struct

MatmulConfig

@register_passable(trivial) struct MatmulConfig[a_type: DType, b_type: DType, c_type: DType, transpose_b: Bool = True]

Static configuration of SM90 GPU matmul.

Fields

block_tile_shape (IndexList[3]):
mma_shape (IndexList[3]):
cluster_shape (IndexList[3]):
num_pipeline_stages (UInt):
num_k_partitions (UInt):
num_consumer (UInt):
partitioned_multicast (Bool):
k_group_size (UInt):

Implemented traits

AnyType, Copyable, Equatable, Hashable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, Stringable, TrivialRegisterPassable, Writable

`comptime` members

`__copy_ctor_is_trivial`

comptime __copy_ctor_is_trivial = True

`__del__is_trivial`

comptime __del__is_trivial = True

`__move_ctor_is_trivial`

comptime __move_ctor_is_trivial = True

Methods

`__init__`

__init__(block_tile_shape: IndexList[3], mma_shape: IndexList[3], cluster_shape: IndexList[3], num_pipeline_stages: Scalar[DType.uint], num_k_partitions: Scalar[DType.uint], num_consumer: Scalar[DType.uint], partitioned_multicast: Bool, pdl_level: PDLLevel, k_group_size: Scalar[DType.uint]) -> Self

Initialize MatmulConfig with explicit values for all fields.

__init__(m: Int, n: Int, k: Int, num_k_partitions: Scalar[DType.uint] = 1, partitioned_multicast: Bool = False, pdl_level: PDLLevel = PDLLevel.OFF, k_groups: Optional[UInt] = None, consumer_groups: Optional[Int] = None, swapAB: Bool = False) -> Self

Initialize MatmulConfig by computing optimal values from M, N, K.

Args:

m (Int): The M dimension of the matmul.
n (Int): The N dimension of the matmul.
k (Int): The K dimension of the matmul.
num_k_partitions (Scalar[DType.uint]): Number of K partitions.
partitioned_multicast (Bool): Whether to use partitioned multicast.
pdl_level (PDLLevel): PDL level for grid controls.
k_groups (Optional[UInt]): How many pipeline stages (loads and stores) are grouped together.
consumer_groups (Optional[Int]): The number of consumer groups.
swapAB (Bool): Whether to swap A and B.
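
As a rough illustration of the M, N, K constructor above, the sketch below builds a config from a problem size and leaves every other argument at its documented default. The import path is an assumption (this page does not say which module exports the struct), and the element types and problem dimensions are arbitrary illustrative choices; only the constructor signature, the `num_pipeline_stages` field, and the `pdl_level`, `__eq__`, and `Writable` members come from this page.

```mojo
# ASSUMPTION: hypothetical module path. This page does not state where
# MatmulConfig lives, so adjust the import to match your installation.
from linalg.matmul.gpu.sm90.config import MatmulConfig


def main():
    # Element types for A, B, and C (illustrative choice).
    # transpose_b keeps its documented default of True.
    comptime Config = MatmulConfig[
        DType.bfloat16, DType.bfloat16, DType.bfloat16
    ]

    # Derive the tile shapes, pipeline depth, and related fields from the
    # problem size. num_k_partitions, partitioned_multicast, pdl_level,
    # k_groups, consumer_groups, and swapAB keep their defaults.
    var config = Config(m=8192, n=4096, k=4096)

    # The struct is Writable, so it can be printed directly.
    print(config)

    # Individual fields and the pdl_level() accessor are available too.
    print(config.num_pipeline_stages)
    var level = config.pdl_level()

    # The struct is Equatable, so two configs can be compared with ==.
    var other = Config(m=8192, n=4096, k=4096)
    print(config == other)
```

Because the struct is `Equatable` and `Hashable`, a config built this way can presumably also serve as a lookup key, for example when caching kernels per configuration; that usage is a suggestion, not something this page documents.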

`__eq__`

__eq__(self, other: Self) -> Bool

Returns:

Bool

`adjust_kgroup_size`

static adjust_kgroup_size(mma_m: Scalar[DType.uint], mma_n: Scalar[DType.uint], K: Scalar[DType.uint], BK: Scalar[DType.uint], num_pipeline_stages: Scalar[DType.uint]) -> UInt

Returns:

UInt

`pdl_level`

pdl_level(self) -> PDLLevel

Returns:

PDLLevel

`to_base_config`

to_base_config(self) -> MatmulConfig[a_type, b_type, c_type, transpose_b]

Convert to base MatmulConfig from utils_gpu.

Returns:

MatmulConfig

`__str__`

__str__(self) -> String

Returns:

String

`write_to`

write_to(self, mut writer: T)

`__repr__`

__repr__(self) -> String

Returns:

String

`__hash__`

__hash__[H: Hasher](self, mut hasher: H)

Updates hasher with the underlying bytes.

Parameters:

H (Hasher): The hasher type.

Args:

hasher (H): The hasher instance.
