Mojo struct
IteratorScatterGatherAmd
struct IteratorScatterGatherAmd[thread_layout: Layout, num_threads: Int = thread_layout.size(), thread_scope: ThreadScope = ThreadScope(0), block_dim_count: Int = 1]
Iterator-based AMD scatter-gather for DRAM-register data movement.
Parameters
- thread_layout (Layout): Thread organization layout.
- num_threads (Int): Total number of threads (defaults to thread_layout.size()).
- thread_scope (ThreadScope): Thread execution scope (block or warp).
- block_dim_count (Int): Number of block dimensions.
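As a rough illustration of the num_threads default, the sketch below builds a thread layout and reads its size. The 4 x 64 shape is an arbitrary assumption chosen for this example, not a value taken from this page. The import path of IteratorScatterGatherAmd itself is not given here, so the sketch stops short of instantiating the struct.

```mojo
from layout import Layout

# Hypothetical thread arrangement: 4 rows of 64 threads (illustrative only).
alias thread_layout = Layout.row_major(4, 64)

# num_threads defaults to thread_layout.size(), the product of the
# layout's dimensions, unless it is passed explicitly.
alias num_threads = thread_layout.size()


def main():
    print(num_threads)
```

Passing such a thread_layout as the first parameter and leaving the remaining parameters at their defaults should give a num_threads equal to this size.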
Fields
- buffer (AMDBufferResource)
Implemented traits
AnyType, UnknownDestructibility
Aliases
__del__is_trivial
alias __del__is_trivial = True
Methods
__init__
__init__(out self, tensor: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], tensor_iter: LayoutTensorIter[dtype, layout, origin, address_space=address_space, alignment=alignment, circular=circular, axis=axis, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked])
Initialize with a tensor and an iterator.
Args:
- tensor (LayoutTensor): Layout tensor for bounds.
- tensor_iter (LayoutTensorIter): Iterator for AMD buffer resource.
copy
copy(self, dst_reg_tile: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], src_gmem_tile_iter: LayoutTensorIter[dtype, layout, origin, address_space=address_space, alignment=alignment, circular=circular, axis=axis, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked])
Copy from DRAM to registers via an iterator.
Args:
- dst_reg_tile (LayoutTensor): Destination register tile.
- src_gmem_tile_iter (LayoutTensorIter): Source memory iterator.