Mojo struct
ScatterGatherAmd
struct ScatterGatherAmd[thread_layout: Layout, num_threads: Int = thread_layout.size(), thread_scope: ThreadScope = ThreadScope.BLOCK, block_dim_count: Int = 1]
AMD tile-based scatter-gather for DRAM-register data movement.
Parameters
- thread_layout (Layout): Thread organization layout.
- num_threads (Int): Total threads (defaults to thread_layout size).
- thread_scope (ThreadScope): Thread execution scope (block or warp).
- block_dim_count (Int): Number of block dimensions.
Fields
- buffer (AMDBufferResource)
Implemented traits
AnyType, ImplicitlyDestructible
Methods
__init__
__init__(out self, tensor: LayoutTensor[address_space=tensor.address_space, element_layout=tensor.element_layout, layout_int_type=tensor.layout_int_type, linear_idx_type=tensor.linear_idx_type, masked=tensor.masked, alignment=tensor.alignment])
Initialize with a tensor.
Args:
- tensor (LayoutTensor[address_space=tensor.address_space, element_layout=tensor.element_layout, layout_int_type=tensor.layout_int_type, linear_idx_type=tensor.linear_idx_type, masked=tensor.masked, alignment=tensor.alignment]): Layout tensor for AMD buffer resource creation.
copy
copy(self, dst_reg_tile: LayoutTensor[address_space=AddressSpace.LOCAL, element_layout=dst_reg_tile.element_layout, layout_int_type=dst_reg_tile.layout_int_type, linear_idx_type=dst_reg_tile.linear_idx_type, masked=dst_reg_tile.masked, alignment=dst_reg_tile.alignment], src_gmem_tile: LayoutTensor[address_space=src_gmem_tile.address_space, element_layout=src_gmem_tile.element_layout, layout_int_type=src_gmem_tile.layout_int_type, linear_idx_type=src_gmem_tile.linear_idx_type, masked=src_gmem_tile.masked, alignment=src_gmem_tile.alignment], offset: Optional[Int] = None)
Copy DRAM to registers.
Args:
- dst_reg_tile (LayoutTensor[address_space=AddressSpace.LOCAL, element_layout=dst_reg_tile.element_layout, layout_int_type=dst_reg_tile.layout_int_type, linear_idx_type=dst_reg_tile.linear_idx_type, masked=dst_reg_tile.masked, alignment=dst_reg_tile.alignment]): Destination register tile.
- src_gmem_tile (LayoutTensor[address_space=src_gmem_tile.address_space, element_layout=src_gmem_tile.element_layout, layout_int_type=src_gmem_tile.layout_int_type, linear_idx_type=src_gmem_tile.linear_idx_type, masked=src_gmem_tile.masked, alignment=src_gmem_tile.alignment]): Source global memory tile.
- offset (Optional[Int]): Optional copy offset.
copy(self, dst_gmem_tile: LayoutTensor[address_space=dst_gmem_tile.address_space, element_layout=dst_gmem_tile.element_layout, layout_int_type=dst_gmem_tile.layout_int_type, linear_idx_type=dst_gmem_tile.linear_idx_type, masked=dst_gmem_tile.masked, alignment=dst_gmem_tile.alignment], src_reg_tile: LayoutTensor[address_space=AddressSpace.LOCAL, element_layout=src_reg_tile.element_layout, layout_int_type=src_reg_tile.layout_int_type, linear_idx_type=src_reg_tile.linear_idx_type, masked=src_reg_tile.masked, alignment=src_reg_tile.alignment])
Copy registers to DRAM.
Args:
- dst_gmem_tile (LayoutTensor[address_space=dst_gmem_tile.address_space, element_layout=dst_gmem_tile.element_layout, layout_int_type=dst_gmem_tile.layout_int_type, linear_idx_type=dst_gmem_tile.linear_idx_type, masked=dst_gmem_tile.masked, alignment=dst_gmem_tile.alignment]): Destination global memory tile.
- src_reg_tile (LayoutTensor[address_space=AddressSpace.LOCAL, element_layout=src_reg_tile.element_layout, layout_int_type=src_reg_tile.layout_int_type, linear_idx_type=src_reg_tile.linear_idx_type, masked=src_reg_tile.masked, alignment=src_reg_tile.alignment]): Source register tile.