Mojo struct
TileWriterTMA
struct TileWriterTMA[tma_origin: ImmutOrigin, dtype: DType, tma_rank: Int, tile_shape: IndexList[tma_rank], desc_shape: IndexList[tma_rank], //]
TMA-based tile writer for hardware-accelerated memory transfers.
This writer uses NVIDIA's Tensor Memory Accelerator (TMA) for efficient 2D tile transfers from shared to global memory.
Parameters
- tma_origin (ImmutOrigin): Origin type for the TMA operation.
- dtype (DType): Data type of the elements being written.
- tma_rank (Int): Rank of the TMA tile (number of dimensions).
- tile_shape (IndexList[tma_rank]): Shape of the TMA tile for async store operations.
- desc_shape (IndexList[tma_rank]): Shape described by the TMA descriptor.
Fields
- tma_op (TileWriterTMA.TMATensorTilePtr)
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDeletable,
Movable,
RegisterPassable,
SMemTileWriter,
TrivialRegisterPassable
comptime members
TMATensorTilePtr
comptime TMATensorTilePtr = Pointer[TMATensorTile[dtype, tma_rank, tile_shape, desc_shape], tma_origin]
Methods
__init__
def __init__(tma_op: Pointer[TMATensorTile[dtype, tma_rank, tile_shape, desc_shape], tma_origin]) -> Self
Initialize the TMA tile writer.
Args:
- tma_op (Pointer[TMATensorTile[dtype, tma_rank, tile_shape, desc_shape], tma_origin]): Pointer to the TMA tensor descriptor.
write_tile
def write_tile(self, src: TileTensor[dtype, address_space=AddressSpace.SHARED, linear_idx_type=src.linear_idx_type, element_size=src.element_size], coords: Tuple[Int, Int])
Write a tile using TMA hardware acceleration.
Performs an asynchronous TMA store from shared memory to global memory. The operation includes proper fencing and synchronization.
Note: Pass coordinates in (N, M) order, that is (col, row), for column-major output.
Args:
- src (TileTensor[dtype, address_space=AddressSpace.SHARED, linear_idx_type=src.linear_idx_type, element_size=src.element_size]): Source tile in shared memory.
- coords (Tuple[Int, Int]): Tile coordinates (col, row) in element space.
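The (col, row) ordering of coords is easy to get backwards, so the following host-side Python sketch models only the addressing. It is not the Mojo API and does not use TMA: `model_write_tile` is a hypothetical helper, the destination is a plain list of rows, and asynchrony, fencing, and the actual memory layout are ignored. It assumes coords names the element-space position of the tile's first element.

```python
def model_write_tile(dst, tile, coords):
    """Copy `tile` into `dst`, placing its first element at coords = (col, row).

    Hypothetical model of the addressing only; not part of any Mojo library.
    """
    col, row = coords  # column offset comes first, then row offset
    for r, tile_row in enumerate(tile):
        for c, value in enumerate(tile_row):
            dst[row + r][col + c] = value


# Destination with 4 rows and 6 columns, zero-filled.
dst = [[0] * 6 for _ in range(4)]
tile = [[1, 2], [3, 4]]  # 2x2 source tile

# col = 4, row = 2: the tile lands in rows 2..3, columns 4..5.
model_write_tile(dst, tile, (4, 2))
```

Passing (2, 4) instead would index row 4 of a 4-row destination and fail, which is the kind of mistake the (col, row) convention tends to produce when callers assume (row, col).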