Mojo struct

ScatterGatherAmd

struct ScatterGatherAmd[thread_layout: Layout, num_threads: Int = thread_layout.size(), thread_scope: ThreadScope = ThreadScope(0), block_dim_count: Int = 1]

Tile-based scatter-gather for moving data between DRAM (global memory) and registers on AMD GPUs.

Parameters

  • thread_layout (Layout): Thread organization layout.
  • num_threads (Int): Total threads (defaults to thread_layout size).
  • thread_scope (ThreadScope): Thread execution scope (block or warp).
  • block_dim_count (Int): Number of block dimensions.

Fields

  • buffer (AMDBufferResource): AMD buffer resource created from the tensor passed to __init__.

Implemented traits

AnyType, UnknownDestructibility

Aliases

__del__is_trivial

alias __del__is_trivial = True

Methods

__init__

__init__(out self, tensor: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment])

Initialize with a tensor.

Args:

  • tensor (LayoutTensor): Layout tensor for AMD buffer resource creation.

copy

copy(self, dst_reg_tile: LayoutTensor[dtype, layout, origin, address_space=AddressSpace(5), element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], src_gmem_tile: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], src_tensor: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], offset: OptionalReg[UInt] = None)

Copy DRAM to registers.

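Args:

  • dst_reg_tile (LayoutTensor): Destination register tile.
  • src_gmem_tile (LayoutTensor): Source global memory tile.
  • src_tensor (LayoutTensor): Source tensor.
  • offset (OptionalReg[UInt]): Optional offset (defaults to None).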

copy(self, dst_gmem_tile: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], src_reg_tile: LayoutTensor[dtype, layout, origin, address_space=AddressSpace(5), element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment])

Copy registers to DRAM.

Args:

  • dst_gmem_tile (LayoutTensor): Destination global memory tile.
  • src_reg_tile (LayoutTensor): Source register tile.
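
The two copy overloads form a gather (DRAM to registers) and a scatter (registers to DRAM). As a rough mental model only, the plain-Python sketch below assumes that a row-major thread layout assigns each thread an interleaved fragment of the tile. This assumption is not confirmed by this page, and the helper names `gather` and `scatter` are hypothetical. Vectorization, the AMD buffer resource, and the optional offset are not modeled.

```python
# Conceptual model only: this is NOT the Mojo API. It assumes a row-major
# thread layout in which thread (tr, tc) owns every element whose row is
# congruent to tr and whose column is congruent to tc (interleaved fragments).

def gather(tile, thread_shape, thread_id):
    """Return the fragment of `tile` owned by one thread (DRAM -> registers)."""
    t_rows, t_cols = thread_shape
    tr, tc = divmod(thread_id, t_cols)  # row-major thread coordinates
    return [
        [tile[r][c] for c in range(tc, len(tile[0]), t_cols)]
        for r in range(tr, len(tile), t_rows)
    ]

def scatter(tile, thread_shape, thread_id, fragment):
    """Write one thread's fragment back into `tile` (registers -> DRAM)."""
    t_rows, t_cols = thread_shape
    tr, tc = divmod(thread_id, t_cols)
    for i, r in enumerate(range(tr, len(tile), t_rows)):
        for j, c in enumerate(range(tc, len(tile[0]), t_cols)):
            tile[r][c] = fragment[i][j]

# A 4x4 "global memory" tile shared by a 2x2 thread layout (4 threads).
src = [[r * 4 + c for c in range(4)] for r in range(4)]
dst = [[None] * 4 for _ in range(4)]
thread_shape = (2, 2)

# Each thread gathers its fragment, then scatters it to the destination tile.
fragments = [gather(src, thread_shape, t) for t in range(4)]
for t, frag in enumerate(fragments):
    scatter(dst, thread_shape, t, frag)
```

In this model every element belongs to exactly one thread, so a gather followed by a scatter reproduces the source tile in the destination.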