Mojo struct

TileWriterThreadwise

struct TileWriterThreadwise[dtype: DType, dst_layout: Layout, dst_address_space: AddressSpace, dst_element_layout: Layout, dst_layout_int_type: DType, dst_linear_idx_type: DType, dst_masked: Bool, dst_alignment: Int, //, thread_layout: Layout, simd_size: Int, half_tile: Bool = False, swapAB: Bool = False]

Fields

  • dst (TileWriterThreadwise[thread_layout, simd_size, half_tile, swapAB].DstType): The destination tensor that tiles are written to.
  • thread_idx (Int): The thread index.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, SMemTileWriter, TrivialRegisterPassable

comptime members

DstType

comptime DstType = LayoutTensor[dtype, dst_layout, MutAnyOrigin, address_space=dst_address_space, element_layout=dst_element_layout, layout_int_type=dst_layout_int_type, linear_idx_type=dst_linear_idx_type, masked=dst_masked, alignment=dst_alignment]

Methods

__init__

__init__(dst: LayoutTensor[dtype, dst_layout, MutAnyOrigin, address_space=dst_address_space, element_layout=dst_element_layout, layout_int_type=dst_layout_int_type, linear_idx_type=dst_linear_idx_type, masked=dst_masked, alignment=dst_alignment], thread_idx: Int) -> Self

Initialize the threadwise tile writer.

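Args:

  • dst (LayoutTensor): The destination tensor, of type DstType.
  • thread_idx (Int): The thread index.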

write_tile

write_tile(self, src: LayoutTensor[dtype, MutAnyOrigin, address_space=AddressSpace.SHARED, element_layout=src.element_layout, layout_int_type=src.layout_int_type, linear_idx_type=src.linear_idx_type, masked=src.masked, alignment=128], coords: Tuple[Int, Int])

Write a tile using thread-distributed stores.

Each thread writes its own portion of the tile, and swizzling is applied to improve the memory access pattern.

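Args:

  • src (LayoutTensor): The source tile. Per the signature, it is in shared memory (address_space=AddressSpace.SHARED) with alignment=128.
  • coords (Tuple[Int, Int]): The coordinates of the tile.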
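To make the idea of thread-distributed stores concrete, here is a minimal host-side model in Python. It is only a sketch and not the Mojo implementation: the function names are hypothetical, the thread layout is assumed to be row-major, coords is assumed to hold tile indices (not element offsets), and swizzling, half_tile and swapAB are left out entirely.

```python
# Conceptual model of thread-distributed tile stores (not the Mojo kernel).
# Assumptions: row-major thread layout, coords are tile indices, no swizzling.


def thread_fragments(thread_idx, tile_shape, thread_layout, simd_size):
    """Yield (row, col_start) for each simd_size-wide vector owned by one thread."""
    rows, cols = tile_shape
    t_rows, t_cols = thread_layout
    vec_cols = cols // simd_size  # number of vectors per tile row
    t_row, t_col = divmod(thread_idx, t_cols)  # position in the thread layout
    for r in range(t_row, rows, t_rows):
        for vc in range(t_col, vec_cols, t_cols):
            yield r, vc * simd_size


def write_tile(dst, src, coords, thread_idx, thread_layout, simd_size):
    """Copy the vectors owned by thread_idx from the src tile into dst."""
    rows, cols = len(src), len(src[0])
    row0, col0 = coords[0] * rows, coords[1] * cols  # top-left corner in dst
    for r, c in thread_fragments(thread_idx, (rows, cols), thread_layout, simd_size):
        dst[row0 + r][col0 + c : col0 + c + simd_size] = src[r][c : c + simd_size]


# A 4x8 source tile written into tile (1, 1) of an 8x16 destination
# by four simulated threads arranged in a 2x2 layout, two elements per store.
src = [[r * 8 + c for c in range(8)] for r in range(4)]
dst = [[-1] * 16 for _ in range(8)]
for t in range(4):
    write_tile(dst, src, (1, 1), t, (2, 2), 2)

# Every element of the tile has been written exactly where it belongs.
print([row[8:16] for row in dst[4:8]] == src)
```

In this model no two threads own the same vector, so the stores never overlap and the tile is complete once every thread has run. On a GPU the threads execute concurrently instead of in a loop.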