Mojo function
cp_async_bulk_wait_group
cp_async_bulk_wait_group[n: SIMD[int32, 1], read: Bool = True]()
Waits for completion of asynchronous bulk memory transfer groups.
This function causes the executing thread to wait until at most n of the most recent bulk async-groups are still pending, so every earlier group has completed by the time it returns. It provides synchronization control for bulk memory transfers on NVIDIA GPUs.
Note: This functionality is only available on NVIDIA GPUs. Attempting to use this function on non-NVIDIA GPUs will result in a compile-time error.
Example:

```mojo
from gpu.sync import cp_async_bulk_wait_group
# Wait until at most 2 async groups are pending
cp_async_bulk_wait_group[2]()
# Wait for all async groups to complete
cp_async_bulk_wait_group[0]()
```
Parameters:
- n (SIMD[int32, 1]): The number of most recent bulk async-groups allowed to remain pending. When n=0, waits for all prior bulk async-groups to complete.
- read (Bool): If True, indicates that subsequent reads to the transferred memory are expected, enabling optimizations for read access patterns. Defaults to True.
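
To make the meaning of n concrete, the following is a minimal host-side sketch of the counting rule only. It does not touch the GPU or call the real API, and it assumes that groups complete oldest first. The helper name `groups_that_must_complete` is hypothetical and is not part of the gpu package.

```mojo
fn groups_that_must_complete(pending: Int, n: Int) -> Int:
    """Hypothetical host-side model, not part of the gpu package.

    Returns how many of the oldest pending bulk async-groups have to finish
    before a wait that allows `n` groups to remain pending can return.
    """
    return max(pending - n, 0)


def main():
    # Three groups in flight, two allowed to remain: only the oldest must finish.
    print(groups_that_must_complete(3, 2))
    # n = 0 drains every group that is in flight.
    print(groups_that_must_complete(3, 0))
    # Fewer groups in flight than allowed: there is nothing to wait for.
    print(groups_that_must_complete(1, 2))
```

Under this model, a larger n leaves more transfers in flight while the thread works on data from older groups, and n=0 acts as a full drain.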