Mojo trait
AsyncTileCopier
Trait for asynchronously copying a TileTensor between address spaces.
Distinct from TileCopier because async copies have semantics the
synchronous trait cannot express: on NVIDIA the copy call only
issues the transfer, and callers must commit it via
async_copy_commit_group() and synchronize via
async_copy_wait_all() or async_copy_wait_group() before reading
the destination tile. Keeping the trait separate prevents code
generic over TileCopier from silently accepting an async copier
and producing reads that race the in-flight transfer, and gives the
async path room to grow (e.g., explicit commit/wait members or
multi-stage pipelining) as the async story is fleshed out.
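The issue / commit / wait ordering described above can be illustrated with a small toy model. This is a sketch in plain Python, not the Mojo API: the class `AsyncCopyModel` and its methods are hypothetical names invented for illustration, tiles are plain lists, and the model assumes that only committed groups are tracked and that data lands in the destination when a wait completes the group. On real hardware a read before the wait is a race with an undefined result; the model simply shows stale data.

```python
class AsyncCopyModel:
    """Toy model of issue / commit / wait ordering (hypothetical, not the Mojo API)."""

    def __init__(self):
        self._open = []    # copies issued but not yet committed
        self._groups = []  # committed groups still in flight, oldest first

    def copy(self, dst, src):
        # Only records the transfer; dst is not written yet.
        self._open.append((dst, src))

    def commit_group(self):
        # Close the current batch of issued copies into one group.
        self._groups.append(self._open)
        self._open = []

    def wait_group(self, n):
        # Complete the oldest groups until at most n remain in flight.
        while len(self._groups) > n:
            for dst, src in self._groups.pop(0):
                dst[:] = src

    def wait_all(self):
        # Model assumption: equivalent to waiting until no group is in flight.
        self.wait_group(0)


pipe = AsyncCopyModel()
src_a, src_b = [1, 2, 3, 4], [5, 6, 7, 8]
tile_a, tile_b = [0] * 4, [0] * 4

pipe.copy(tile_a, src_a)
pipe.commit_group()
pipe.copy(tile_b, src_b)
pipe.commit_group()

early = list(tile_a)  # read before any wait: stale in this model
pipe.wait_group(1)    # oldest group is complete, newest may still be in flight
after_first = (list(tile_a), list(tile_b))
pipe.wait_all()
after_all = (list(tile_a), list(tile_b))
```

In this model, `early` still holds the initial zeros, which is the kind of premature read the separate trait is meant to keep generic synchronous code from performing by accident.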
Implemented traits
comptime members
dst_address_space
comptime dst_address_space
Destination AddressSpace the copier writes to.
src_address_space
comptime src_address_space
Source AddressSpace the copier reads from.
Required methods
copy
copy[element_size: Int](self: _Self, dst: TileTensor[dst.dtype, dst.LayoutType, dst.origin, address_space=_Self.dst_address_space, linear_idx_type=dst.linear_idx_type, element_size=element_size], src: TileTensor[src.dtype, src.LayoutType, src.origin, address_space=_Self.src_address_space, linear_idx_type=src.linear_idx_type, element_size=element_size])
Asynchronously copies src into dst.
The copy may not be complete when this call returns; callers
must commit and synchronize before reading dst.
Parameters:
- element_size (Int): Number of scalar elements per logical element.
Args:
- dst (TileTensor): Destination tile in dst_address_space.
- src (TileTensor): Source tile in src_address_space.