Mojo function
tma_store_with_pipeline
tma_store_with_pipeline[c_type: DType, c_layout: Layout, c_desc_layout: Layout, is_last_stage: Bool](c_tma_op: TMATensorTile[c_type, c_layout, c_desc_layout], src: LayoutTensor[c_type, layout, MutAnyOrigin, address_space=AddressSpace.SHARED, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=128], coords: Tuple[UInt, UInt])
Perform TMA store with pipelined commit and wait.
Encapsulates the common SM100 output pattern:
- fence_async_view_proxy()
- async_store()
- commit_group()
- wait_group() with pipelining
Template Parameters:
- c_type (DType): Output data type.
- c_layout (Layout): Global memory layout for C.
- c_desc_layout (Layout): TMA descriptor layout for C.
- is_last_stage (Bool): If True, wait for all committed stores to complete; otherwise keep one store group in flight.
Args:
- c_tma_op (TMATensorTile): TMA tensor tile descriptor.
- src (LayoutTensor): Source shared memory tile.
- coords (Tuple): Destination coordinates in global memory.
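The effect of is_last_stage on the commit and wait steps can be illustrated with a small host-side model. This is only a sketch of the bookkeeping, not the library's implementation: it makes no GPU or TMA calls, the helpers `pending_after_wait` and `store_stage` are hypothetical names introduced here, and `is_last_stage` is modeled as a runtime Bool even though the real function takes it as a compile-time parameter.

```mojo
# Host-side model of the commit/wait bookkeeping; no GPU or TMA calls are made.
# `pending_after_wait` and `store_stage` are hypothetical helpers for illustration.


fn pending_after_wait(pending: Int, max_pending: Int) -> Int:
    # Models wait_group(): return once at most `max_pending` committed
    # store groups are still outstanding.
    return min(pending, max_pending)


fn store_stage(pending: Int, is_last_stage: Bool) -> Int:
    # fence_async_view_proxy() and async_store() are not modeled:
    # they order and issue the copy but do not change the group count.
    var after_commit = pending + 1  # commit_group(): one more group outstanding
    # Last stage drains everything; earlier stages leave one group in flight.
    var max_pending = 0 if is_last_stage else 1
    return pending_after_wait(after_commit, max_pending)


fn main():
    var num_stages = 4  # assumed stage count, chosen for illustration
    var pending = 0
    for stage in range(num_stages):
        var is_last = stage == num_stages - 1
        pending = store_stage(pending, is_last)
        print("stage", stage, "pending groups after wait:", pending)
```

In this model every stage except the last returns with one store group still pending, so the next stage can begin preparing its shared memory tile while the previous store drains. The last stage returns with zero pending groups, meaning all stores have completed.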