Mojo function

create_tma_tile_template

create_tma_tile_template[type: DType, rank: Int, tile_shape: IndexList[rank], /, is_k_major: Bool = True, swizzle_mode: TensorMapSwizzle = TensorMapSwizzle.SWIZZLE_NONE, *, __tile_layout: Layout = Layout.row_major(tile_shape[0], tile_shape[1]), __desc_layout: Layout = _tma_desc_tile_layout[...]]() -> TMATensorTile[type, __tile_layout, __desc_layout]

Same as create_tma_tile, except that the descriptor is only a placeholder (a template) to be replaced later.

The function allows specification of data type, rank, and layout orientation. It supports both 2D and 3D tensors and provides fine-grained control over the memory access patterns.

Constraints:

  • Only supports 2D and 3D tensors (rank must be 2 or 3).
  • For non-SWIZZLE_NONE modes, the K dimension size in bytes must be a multiple of the swizzle mode's byte size.
  • For MN-major layout, only SWIZZLE_128B is supported.
  • For 3D tensors, only K-major layout is supported.
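
The constraints above can be read as four independent checks. The following is a minimal Python sketch of that reading, not the Mojo implementation: the function name `constraint_violations` and its arguments are hypothetical. The swizzle widths are taken from the mode names, and the `SWIZZLE_32B` and `SWIZZLE_64B` entries are assumed to exist by analogy with CUDA tensor map naming. The K dimension length is passed in explicitly because this page does not say which entry of `tile_shape` holds it.

```python
# Byte width implied by each swizzle mode name (assumed; SWIZZLE_NONE has none).
SWIZZLE_BYTES = {
    "SWIZZLE_NONE": 0,
    "SWIZZLE_32B": 32,
    "SWIZZLE_64B": 64,
    "SWIZZLE_128B": 128,
}


def constraint_violations(rank, k_elems, elem_bytes,
                          is_k_major=True, swizzle_mode="SWIZZLE_NONE"):
    """Return the list of violated constraints (empty if none are violated).

    k_elems is the K dimension length in elements and elem_bytes is the
    size of one element in bytes; both are hypothetical helper arguments.
    """
    if swizzle_mode not in SWIZZLE_BYTES:
        raise ValueError(f"unknown swizzle mode: {swizzle_mode}")

    problems = []
    if rank not in (2, 3):
        problems.append("rank must be 2 or 3")

    width = SWIZZLE_BYTES[swizzle_mode]
    if swizzle_mode != "SWIZZLE_NONE" and (k_elems * elem_bytes) % width != 0:
        problems.append("K dimension bytes must be a multiple of the swizzle width")

    if not is_k_major and swizzle_mode != "SWIZZLE_128B":
        problems.append("MN-major layout requires SWIZZLE_128B")

    if rank == 3 and not is_k_major:
        problems.append("3D tensors require K-major layout")

    return problems
```

For example, a 2D K-major tile with 64 two-byte elements along K spans 128 bytes, so it passes the check for `SWIZZLE_128B`, while 24 two-byte elements (48 bytes) would fail the check for `SWIZZLE_64B`.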

Parameters:

  • type (DType): The data type of the tensor elements.
  • rank (Int): The dimensionality of the tensor (must be 2 or 3).
  • tile_shape (IndexList[rank]): The shape of the tile to be transferred.
  • is_k_major (Bool): Whether the tensor layout is K-major (True) or MN-major (False). K-major is typically used for weight matrices, while MN-major is used for activation matrices in matrix multiplication operations. Defaults to True.
  • swizzle_mode (TensorMapSwizzle): The swizzling mode to use for memory access optimization. Defaults to TensorMapSwizzle.SWIZZLE_NONE.
  • __tile_layout (Layout): Internal parameter for the tile layout in shared memory. Defaults to Layout.row_major(tile_shape[0], tile_shape[1]).
  • __desc_layout (Layout): Internal parameter for the descriptor layout, which may differ from the tile layout to accommodate hardware requirements. Defaults to _tma_desc_tile_layout[...].
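
The default `__tile_layout` is row-major over the first two tile dimensions. As an illustration of what row-major ordering means for element placement, here is a small Python sketch; it models the general row-major rule only and does not use the Mojo `Layout` type. The helper name `row_major_offset` is hypothetical.

```python
def row_major_offset(shape, coord):
    """Linear element offset of coord in a 2D row-major layout of the given shape."""
    rows, cols = shape
    i, j = coord
    if not (0 <= i < rows and 0 <= j < cols):
        raise IndexError(f"coordinate {coord} is outside shape {shape}")
    # Consecutive elements of a row are adjacent; each row starts cols elements later.
    return i * cols + j
```

With a tile of shape (64, 32), the element at row 2, column 5 sits at offset 2 * 32 + 5 = 69.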

Returns:

A TMATensorTile configured with the specified parameters, ready for use in asynchronous data transfer operations.