Mojo struct
SharedToLocalTileCopier
struct SharedToLocalTileCopier[thread_layout: Layout[thread_layout.shape_types, thread_layout.stride_types], *, thread_scope: ThreadScope = ThreadScope.BLOCK]
A TileCopier that moves a tile from shared memory into registers.
thread_layout is used as the warp layout. Axis-based distribution
is not yet supported. The copier currently produces correct data
only when src was populated without a swizzle; reading a swizzled
shared-memory tile into local memory is not yet supported.
Parameters
- thread_layout (Layout): Warp layout describing how threads are organized over the copy.
- thread_scope (ThreadScope): Scope at which thread operations are performed.
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDestructible,
Movable,
TileCopier
comptime members
dst_address_space
comptime dst_address_space = AddressSpace.LOCAL
Destination AddressSpace this copier writes to.
src_address_space
comptime src_address_space = AddressSpace.SHARED
Source AddressSpace this copier reads from.
Methods
copy
copy[element_size: Int](self, dst: TileTensor[dst.dtype, dst.LayoutType, dst.origin, address_space=SharedToLocalTileCopier[thread_layout, thread_scope=thread_scope].dst_address_space, linear_idx_type=dst.linear_idx_type, element_size=element_size], src: TileTensor[src.dtype, src.LayoutType, src.origin, address_space=SharedToLocalTileCopier[thread_layout, thread_scope=thread_scope].src_address_space, linear_idx_type=src.linear_idx_type, element_size=element_size])
Copies src in shared memory into dst in local memory.
Parameters:
- element_size (Int): Number of scalar elements per logical element.
Args:
- dst (TileTensor): Destination tile in local memory.
- src (TileTensor): Source tile in shared memory.
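To make the data movement concrete, here is a minimal conceptual model of what a shared-to-local copy does. It is written in plain Python rather than Mojo because this page does not show how the copier or its tiles are constructed. It is not the library's implementation. The helper names (distribute_fragment, shared_to_local_copy) are hypothetical. The row-major thread ordering and the strided element assignment are assumptions chosen for illustration, and the real mapping is whatever thread_layout defines. Swizzling is left out, matching the unswizzled-source restriction noted above.

```python
# Conceptual model only: plain Python lists stand in for a shared-memory
# tile and for each thread's register (local) fragment.

def distribute_fragment(tile, thread_shape, thread_id):
    """Return the elements one thread would hold in its local fragment.

    Assumption: threads are numbered row-major over thread_shape, and
    thread (i, j) takes every element whose row is i modulo the thread
    rows and whose column is j modulo the thread columns.
    """
    t_rows, t_cols = thread_shape
    t_r, t_c = divmod(thread_id, t_cols)
    rows, cols = len(tile), len(tile[0])
    return [
        [tile[r][c] for c in range(t_c, cols, t_cols)]
        for r in range(t_r, rows, t_rows)
    ]


def shared_to_local_copy(shared_tile, thread_shape):
    """Model the copy: every thread reads its own fragment of the tile."""
    num_threads = thread_shape[0] * thread_shape[1]
    return [
        distribute_fragment(shared_tile, thread_shape, t)
        for t in range(num_threads)
    ]


# A 4x4 "shared" tile holding 0..15, copied by a 2x2 arrangement of threads.
shared_tile = [[r * 4 + c for c in range(4)] for r in range(4)]
fragments = shared_to_local_copy(shared_tile, (2, 2))
```

In this model each thread ends up with a 2x2 fragment, and every element of the source tile lands in exactly one thread's fragment.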