Mojo function
async_copy_with_bound_check
async_copy_with_bound_check[dtype: DType, src_layout: Layout, dst_layout: Layout, //, thread_layout: Layout, swizzle_mode: TensorMapSwizzle](src: LayoutTensor[dtype, src_layout, MutableAnyOrigin, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], dst: LayoutTensor[dtype, dst_layout, MutableAnyOrigin, address_space=AddressSpace(3), element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment])
Helper function for cp.async with boundary checking.
This function performs element-wise asynchronous copies with a boundary check on each element. Out-of-bounds elements are zero-filled in the destination, so tiles that overlap a matrix edge can be copied safely.
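The zero-fill behavior can be illustrated with a small host-side model. This is only a sketch in plain Python, not the Mojo implementation: the helper name `copy_with_bound_check`, its arguments, and the row-major flat-list representation are all assumptions made for illustration, and the real function issues asynchronous `cp.async` copies distributed across threads rather than a sequential loop.

```python
def copy_with_bound_check(src, src_rows, src_cols, row0, col0, tile_rows, tile_cols):
    """Copy a tile from a row-major flat list, zero-filling out-of-bounds elements.

    Hypothetical helper: models only the per-element boundary check, not the
    asynchronous or thread-distributed behavior of the real function.
    """
    tile = []
    for r in range(tile_rows):
        out_row = []
        for c in range(tile_cols):
            gr, gc = row0 + r, col0 + c          # coordinates in the source matrix
            in_bounds = gr < src_rows and gc < src_cols
            out_row.append(src[gr * src_cols + gc] if in_bounds else 0)
        tile.append(out_row)
    return tile


# A 3x4 source matrix holding 1..12 in row-major order.
matrix = list(range(1, 13))

# A 2x4 tile anchored at (2, 2) overhangs both the bottom and the right edge.
edge_tile = copy_with_bound_check(matrix, 3, 4, 2, 2, 2, 4)
print(edge_tile)  # [[11, 12, 0, 0], [0, 0, 0, 0]]
```

Only the two elements that fall inside the source are copied; every other slot in the tile is zero, which is what makes later arithmetic on the full tile safe.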
The function also applies shared memory swizzling, selected by swizzle_mode, to avoid bank conflicts and make better use of shared memory bandwidth.
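To show why swizzling helps, here is a generic XOR swizzle modeled in plain Python. It is an illustrative assumption, not the mapping that any particular TensorMapSwizzle mode applies: the function name `xor_swizzle`, the chunk granularity, and the choice of 8 chunks per row are hypothetical.

```python
def xor_swizzle(row, chunk, chunks_per_row=8):
    """Map a logical (row, chunk) position to a swizzled chunk index.

    Hypothetical XOR pattern; chunks_per_row must be a power of two so that
    the XOR stays inside the range 0..chunks_per_row-1.
    """
    return chunk ^ (row % chunks_per_row)


rows = 8

# Without swizzling, chunk 0 of every row sits in the same chunk column,
# so threads reading down a column all hit the same group of banks.
plain = [0 for _ in range(rows)]

# With the XOR swizzle, chunk 0 of each row lands in a different chunk column.
swizzled = [xor_swizzle(row, 0) for row in range(rows)]
print(swizzled)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Within a single row the mapping is a permutation of the chunk indices, so no data is lost or duplicated; only the placement inside shared memory changes.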
Template Parameters:
- dtype (
DType
): Data type of the elements.
- src_layout (
Layout
): Layout of the source tile.
- dst_layout (
Layout
): Layout of the destination tile.
- thread_layout (
Layout
): Thread arrangement for distributed copying.
- swizzle_mode (
TensorMapSwizzle
): Swizzling pattern for bank conflict avoidance.
Args:
- src (
LayoutTensor
): Source tensor fragment in global memory.
- dst (
LayoutTensor
): Destination tensor fragment in shared memory.