Mojo struct
TileWriterThreadwise
@register_passable("trivial")
struct TileWriterThreadwise[dtype: DType, dst_layout: Layout, dst_address_space: AddressSpace, dst_element_layout: Layout, dst_layout_int_type: DType, dst_linear_idx_type: DType, dst_masked: Bool, dst_alignment: Int, //, thread_layout: Layout, simd_size: Int, half_tile: Bool = False]
Fields
- dst (LayoutTensor[dtype, dst_layout, MutableAnyOrigin, address_space=dst_address_space, element_layout=dst_element_layout, layout_int_type=dst_layout_int_type, linear_idx_type=dst_linear_idx_type, masked=dst_masked, alignment=dst_alignment])
- thread_idx (UInt)
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
Movable,
SMemTileWriter,
UnknownDestructibility
Aliases
__copyinit__is_trivial
alias __copyinit__is_trivial = True
__del__is_trivial
alias __del__is_trivial = True
__moveinit__is_trivial
alias __moveinit__is_trivial = True
DstType
alias DstType = LayoutTensor[dtype, dst_layout, MutableAnyOrigin, address_space=dst_address_space, element_layout=dst_element_layout, layout_int_type=dst_layout_int_type, linear_idx_type=dst_linear_idx_type, masked=dst_masked, alignment=dst_alignment]
Methods
__init__
__init__(dst: LayoutTensor[dtype, dst_layout, MutableAnyOrigin, address_space=dst_address_space, element_layout=dst_element_layout, layout_int_type=dst_layout_int_type, linear_idx_type=dst_linear_idx_type, masked=dst_masked, alignment=dst_alignment], thread_idx: UInt) -> Self
Initialize the threadwise tile writer.
Args:
- dst (LayoutTensor): Destination tensor in global memory.
- thread_idx (UInt): Thread index within the consumer warp group.
write_tile
write_tile(self, src: LayoutTensor[dtype, layout, MutableAnyOrigin, address_space=AddressSpace(3), element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=128], coords: Tuple[UInt, UInt])
Write a tile using thread-distributed stores.
Each thread writes its own portion of the tile, with swizzling applied so that memory accesses stay efficient.
Args:
- src (LayoutTensor): Source tile in shared memory.
- coords (Tuple): Tile indices (row_tile, col_tile) in the destination matrix.
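The following is a CPU-side Python model of the idea behind write_tile, not the Mojo API. It sketches how tile indices (row_tile, col_tile) could select a region of the destination and how threads could share the stores, each writing simd_size contiguous elements at a time. The function name write_tile_model, the round-robin thread assignment, and the rule that tile indices are multiplied by the tile shape are all assumptions made for illustration. Swizzling and the half_tile parameter are not modeled.

```python
# Illustrative model only. Every name here is hypothetical and nothing in this
# block comes from the Mojo standard library or kernel sources.

def write_tile_model(dst, src, coords, num_threads, simd_size):
    """Copy the 2D tile `src` into `dst` at tile indices `coords`.

    `coords` is (row_tile, col_tile). This sketch ASSUMES the element offset
    is the tile index times the tile shape. Each simulated thread stores
    `simd_size` contiguous elements per write, and vectors are handed out to
    threads round-robin (also an assumption).

    Returns the number of vector stores issued by each thread.
    """
    tile_rows, tile_cols = len(src), len(src[0])
    if tile_cols % simd_size != 0:
        raise ValueError("tile width must be a multiple of simd_size")

    row_tile, col_tile = coords
    row0 = row_tile * tile_rows  # first destination row of this tile
    col0 = col_tile * tile_cols  # first destination column of this tile

    vecs_per_row = tile_cols // simd_size
    total_vecs = tile_rows * vecs_per_row
    stores_per_thread = [0] * num_threads

    for thread_idx in range(num_threads):
        # Thread t handles vectors t, t + num_threads, t + 2 * num_threads, ...
        for v in range(thread_idx, total_vecs, num_threads):
            r, vec_col = divmod(v, vecs_per_row)
            c = vec_col * simd_size
            dst[row0 + r][col0 + c : col0 + c + simd_size] = src[r][c : c + simd_size]
            stores_per_thread[thread_idx] += 1

    return stores_per_thread


# Usage: write a 4x4 tile into tile position (1, 0) of an 8x8 destination.
dst = [[0] * 8 for _ in range(8)]
src = [[r * 4 + c + 1 for c in range(4)] for r in range(4)]
stores = write_tile_model(dst, src, coords=(1, 0), num_threads=4, simd_size=2)
```

In this model the tile occupies rows 4 to 7 and columns 0 to 3 of the destination, and the eight vector stores are split evenly across the four simulated threads. A real GPU writer would also need to respect the destination layout, masking, and alignment parameters listed in the struct signature.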