Mojo struct

MmaOpSM100_BlockScaled_SS

struct MmaOpSM100_BlockScaled_SS[c_type: DType, a_type: DType, b_type: DType, sfa_dtype: DType, sfb_dtype: DType, scaling_kind: UMMAKind, block_tile_shape: IndexList[3], mma_shape: IndexList[3], /, *, accum_type: DType = DType.float32, cta_group: Int = 1, cluster_shape: IndexList[3] = Index[Int, Int, Int](1, 1, 1), a_swizzle: TensorMapSwizzle = TensorMapSwizzle.SWIZZLE_128B, b_swizzle: TensorMapSwizzle = TensorMapSwizzle.SWIZZLE_128B, transpose_b: Bool = False, enable_small_sfb: Bool = False]

Fields

  • idesc (UMMAInsDescriptor[scaling_kind])
  • mask (UInt16)

Implemented traits

AnyType, Copyable, Defaultable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, TrivialRegisterPassable

Methods

__init__

__init__() -> Self

mma

mma(self, a: TileTensor[address_space=AddressSpace.SHARED, linear_idx_type=a.linear_idx_type, element_size=a.element_size], b: TileTensor[address_space=AddressSpace.SHARED, linear_idx_type=b.linear_idx_type, element_size=b.element_size], sfa_smem: TileTensor[address_space=AddressSpace.SHARED, linear_idx_type=sfa_smem.linear_idx_type, element_size=sfa_smem.element_size], sfb_smem: TileTensor[address_space=AddressSpace.SHARED, linear_idx_type=sfb_smem.linear_idx_type, element_size=sfb_smem.element_size], c_tmem: UInt32, sfa_tmem: UInt32, sfb_tmem: UInt32, init_c: Bool, sfb_tmem_adj: UInt32 = UInt32(0))

TileTensor overload for block-scaled MMA input tiles.

Creates MMA descriptors directly from swizzle parameters.
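
For intuition about what a block-scaled MMA computes, the following is a minimal pure-Python reference, not the Mojo API and not how the hardware executes it. It assumes one scale factor per row per block of `block_k` consecutive K elements, and it assumes B is stored as N x K so the product is A times B transposed. The names `block_scaled_matmul` and `block_k` are hypothetical and do not appear in this struct.

```python
def block_scaled_matmul(a, b, sfa, sfb, block_k):
    """Reference for D = (A * SFA) @ (B * SFB)^T with per-block scales along K.

    a:   M x K matrix (list of rows)
    b:   N x K matrix (list of rows); assumed layout for this sketch
    sfa: M x (K // block_k) scale factors for A
    sfb: N x (K // block_k) scale factors for B
    """
    m, n, k = len(a), len(b), len(a[0])
    assert k % block_k == 0, "K must be a multiple of the scaling block size"
    num_blocks = k // block_k

    d = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for kb in range(num_blocks):
                lo, hi = kb * block_k, (kb + 1) * block_k
                # Unscaled dot product over one K block.
                partial = sum(a[i][kk] * b[j][kk] for kk in range(lo, hi))
                # Apply the A-side and B-side scale factors for this block.
                acc += sfa[i][kb] * sfb[j][kb] * partial
            d[i][j] = acc
    return d


# One output element, K = 4 split into two blocks of 2.
a = [[1.0, 2.0, 3.0, 4.0]]
b = [[1.0, 1.0, 1.0, 1.0]]
sfa = [[2.0, 0.5]]
sfb = [[1.0, 4.0]]
print(block_scaled_matmul(a, b, sfa, sfb, block_k=2))  # [[20.0]]
```

In the example, the first block contributes 2.0 * 1.0 * (1 + 2) = 6.0 and the second contributes 0.5 * 4.0 * (3 + 4) = 14.0, which sum to 20.0.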

commit

commit(self, ptr_mbar: UnsafePointer[address_space=AddressSpace.SHARED])

wait

wait(self)