
Mojo function

async_copy_with_bound_check

async_copy_with_bound_check[dtype: DType, src_layout: Layout, dst_layout: Layout, //, thread_layout: Layout, swizzle_mode: TensorMapSwizzle](src: LayoutTensor[dtype, src_layout, MutableAnyOrigin, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], dst: LayoutTensor[dtype, dst_layout, MutableAnyOrigin, address_space=AddressSpace(3), element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment])

Helper function for cp.async with boundary checking.

This function issues element-wise asynchronous copies and checks each element against the source bounds. Out-of-bounds elements are zero-filled in the destination, which keeps the copy safe near matrix edges.
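The zero-fill behavior can be pictured with a small CPU-side model. This is a sketch in plain Python, not the Mojo implementation: the helper name `copy_with_bound_check`, the row-major flat list, and the explicit tile origin are all assumptions made for illustration, and nothing here is asynchronous.

```python
def copy_with_bound_check(src, src_rows, src_cols, row0, col0, tile_rows, tile_cols):
    """Model of a per-element bounds-checked tile copy (hypothetical helper).

    src is a flat row-major list of size src_rows * src_cols. The tile starts
    at (row0, col0) and may extend past the matrix edge.
    """
    # Start from an all-zero tile so skipped elements stay zero-filled.
    tile = [[0] * tile_cols for _ in range(tile_rows)]
    for r in range(tile_rows):
        for c in range(tile_cols):
            gr, gc = row0 + r, col0 + c
            # Per-element check: only in-bounds elements are read from src.
            if gr < src_rows and gc < src_cols:
                tile[r][c] = src[gr * src_cols + gc]
    return tile


# A 3x4 matrix holding 1..12, and a 2x4 tile that overhangs the bottom-right corner.
matrix = list(range(1, 13))
edge_tile = copy_with_bound_check(matrix, 3, 4, 2, 2, 2, 4)
print(edge_tile)
```

Only the two elements that fall inside the matrix are copied; every other slot of the tile keeps its zero.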

The function also applies shared memory swizzling, selected by swizzle_mode, to avoid bank conflicts and maximize memory bandwidth utilization.
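To see why swizzling helps, consider a generic XOR swizzle. The sketch below is an assumption-laden model, not the pattern any particular `TensorMapSwizzle` value produces: it assumes 32 banks of one word each, a row stride of 32 words, and a hypothetical `bank` helper that XORs the column with the row index.

```python
NUM_BANKS = 32  # assumed bank count, one word per bank


def bank(row, col, row_stride_words=32, swizzle=False):
    """Bank hit by element (row, col) of a row-major shared memory tile."""
    # Generic XOR swizzle (illustrative only): permute columns within each row.
    phys_col = col ^ (row % row_stride_words) if swizzle else col
    return (row * row_stride_words + phys_col) % NUM_BANKS


# Eight threads reading column 0 of eight consecutive rows.
plain = {bank(r, 0) for r in range(8)}
swizzled = {bank(r, 0, swizzle=True) for r in range(8)}
print(len(plain), len(swizzled))
```

Without swizzling, all eight accesses land in the same bank and serialize. With the XOR permutation, each row maps column 0 to a different bank, so the accesses can proceed in parallel.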

Template Parameters:

  • dtype (DType): Data type of the elements.
  • src_layout (Layout): Layout of the source tile.
  • dst_layout (Layout): Layout of the destination tile.
  • thread_layout (Layout): Thread arrangement for distributed copying.
  • swizzle_mode (TensorMapSwizzle): Swizzling pattern for bank conflict avoidance.
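One way to read "thread arrangement for distributed copying" is that the tile's elements are divided among the threads in a strided pattern. The following sketch assumes a row-major thread grid and a strided assignment; both are illustrative assumptions, and `elements_for_thread` is a hypothetical helper rather than part of the Mojo API.

```python
def elements_for_thread(thread_id, thread_rows, thread_cols, tile_rows, tile_cols):
    """Tile coordinates handled by one thread under an assumed strided split."""
    # Assumed row-major numbering of threads within the thread grid.
    tr, tc = divmod(thread_id, thread_cols)
    return [
        (r, c)
        for r in range(tr, tile_rows, thread_rows)
        for c in range(tc, tile_cols, thread_cols)
    ]


# A 2x4 thread arrangement copying a 4x8 tile: each thread handles 4 elements.
assignments = {t: elements_for_thread(t, 2, 4, 4, 8) for t in range(8)}
covered = sorted(p for pts in assignments.values() for p in pts)
print(assignments[5])
```

Under these assumptions every element of the tile is copied by exactly one thread, and each thread performs its own boundary check on the elements it owns.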

Args:

  • src (LayoutTensor): Source tensor fragment in global memory.
  • dst (LayoutTensor): Destination tensor fragment in shared memory.
