Mojo struct

TMemTile

@register_passable(trivial) struct TMemTile[dtype_: DType, BM: Int, BN: Int]

Fields

tmem_addr (UInt32)

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, TrivialRegisterPassable

`comptime` members

`__copy_ctor_is_trivial`

comptime __copy_ctor_is_trivial = True

`__del__is_trivial`

comptime __del__is_trivial = True

`__move_ctor_is_trivial`

comptime __move_ctor_is_trivial = True

`dtype`

comptime dtype = dtype_

`dtype_size`

comptime dtype_size = size_of[TMemTile[dtype_, BM, BN].dtype]()

`num_m_tiles`

comptime num_m_tiles = (BM // 64)

Methods

`__init__`

__init__(tmem_addr: UInt32) -> Self

`__getitem__`

__getitem__(self, i: UInt32) -> Self

`offset`

offset[m_mma: Int, n_mma: Int](self) -> UInt32

Returns:

UInt32

`allocate_register_tile`

static allocate_register_tile[*, num_threads: Int]() -> LayoutTensor[TMemTile[dtype_, BM, BN].dtype, STMatrixLayout[BM, BN, num_threads=num_threads, accum_type_size=TMemTile[dtype_, BM, BN].dtype_size].vec_local_layout, MutAnyOrigin, address_space=AddressSpace.LOCAL, element_layout=STMatrixLayout[BM, BN, num_threads=num_threads, accum_type_size=TMemTile[dtype_, BM, BN].dtype_size].element_layout]

Returns:

LayoutTensor

`store_async`

store_async[*, num_threads: Int](self, src: LayoutTensor[TMemTile[dtype_, BM, BN].dtype, STMatrixLayout[BM, BN, num_threads=num_threads, accum_type_size=TMemTile[dtype_, BM, BN].dtype_size].vec_local_layout, MutAnyOrigin, address_space=AddressSpace.LOCAL, element_layout=STMatrixLayout[BM, BN, num_threads=num_threads, accum_type_size=TMemTile[dtype_, BM, BN].dtype_size].element_layout])

store_async[src_type: DType](self, src: TileTensor[src_type, Layout[ComptimeInt[BN], ComptimeInt[1]], MutExternalOrigin, address_space=AddressSpace.LOCAL])

`store`

store[*, num_threads: Int](self, src: LayoutTensor[TMemTile[dtype_, BM, BN].dtype, STMatrixLayout[BM, BN, num_threads=num_threads, accum_type_size=TMemTile[dtype_, BM, BN].dtype_size].vec_local_layout, MutAnyOrigin, address_space=AddressSpace.LOCAL, element_layout=STMatrixLayout[BM, BN, num_threads=num_threads, accum_type_size=TMemTile[dtype_, BM, BN].dtype_size].element_layout])

store[src_type: DType](self, src: TileTensor[src_type, Layout[ComptimeInt[BN], ComptimeInt[1]], MutExternalOrigin, address_space=AddressSpace.LOCAL])

`load_async_with_st_matrix_layout`

load_async_with_st_matrix_layout[*, num_threads: Int](self) -> LayoutTensor[TMemTile[dtype_, BM, BN].dtype, STMatrixLayout[BM, BN, num_threads=num_threads, accum_type_size=TMemTile[dtype_, BM, BN].dtype_size].vec_local_layout, MutAnyOrigin, address_space=AddressSpace.LOCAL, element_layout=STMatrixLayout[BM, BN, num_threads=num_threads, accum_type_size=TMemTile[dtype_, BM, BN].dtype_size].element_layout]

Returns:

LayoutTensor

`load_async`

load_async(self) -> TileTensor[TMemTile[dtype_, BM, BN].dtype, Layout[ComptimeInt[BN], ComptimeInt[1]], MutExternalOrigin, address_space=AddressSpace.LOCAL]

Returns:

TileTensor
