Mojo function

tma_store_with_pipeline

tma_store_with_pipeline[c_type: DType, c_layout: Layout, c_desc_layout: Layout, is_last_stage: Bool](c_tma_op: TMATensorTile[c_type, c_layout, c_desc_layout], src: LayoutTensor[c_type, layout, MutAnyOrigin, address_space=AddressSpace.SHARED, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=128], coords: Tuple[UInt, UInt])

Perform TMA store with pipelined commit and wait.

Encapsulates the common SM100 output pattern:

  1. fence_async_view_proxy()
  2. async_store()
  3. commit_group()
  4. wait_group() with pipelining

Template Parameters:

  • c_type (DType): Output data type.
  • c_layout (Layout): Global memory layout for C.
  • c_desc_layout (Layout): TMA descriptor layout for C.
  • is_last_stage (Bool): If True, wait for all outstanding stores to complete; otherwise keep one store group in flight.
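The commit/wait behavior described above can be illustrated with a small host-side model. This is a sketch in Python, not the Mojo implementation: `BulkStoreQueue`, `store_with_pipeline`, and `run_stages` are hypothetical names, the fence step is not modeled, and mapping `is_last_stage` to wait thresholds of 0 and 1 is an assumption drawn from the parameter description.

```python
from collections import deque


class BulkStoreQueue:
    """Toy model of commit/wait groups for asynchronous stores (hypothetical)."""

    def __init__(self):
        self.uncommitted = []     # stores issued since the last commit
        self.in_flight = deque()  # committed groups, oldest first
        self.completed = []       # stores whose group has drained

    def async_store(self, tile, coords):
        # Issue a store; it joins the currently open (uncommitted) group.
        self.uncommitted.append((tile, coords))

    def commit_group(self):
        # Close the open group and mark it as in flight.
        self.in_flight.append(self.uncommitted)
        self.uncommitted = []

    def wait_group(self, max_pending):
        # Drain the oldest groups until at most max_pending remain in flight.
        while len(self.in_flight) > max_pending:
            self.completed.extend(self.in_flight.popleft())


def store_with_pipeline(queue, tile, coords, is_last_stage):
    # Step 1 (fence_async_view_proxy) has no counterpart in this model.
    queue.async_store(tile, coords)   # step 2
    queue.commit_group()              # step 3
    # Step 4: assumed thresholds, 0 on the last stage and 1 otherwise.
    queue.wait_group(0 if is_last_stage else 1)


def run_stages(num_stages):
    # Run a sequence of output stages and record the in-flight group count
    # after each one.
    queue = BulkStoreQueue()
    pending_after_each = []
    for stage in range(num_stages):
        is_last = stage == num_stages - 1
        store_with_pipeline(queue, f"tile{stage}", (stage, 0), is_last)
        pending_after_each.append(len(queue.in_flight))
    return queue, pending_after_each
```

In this model, every stage except the last returns while its own store group is still pending, so the next stage's work can overlap with it; only the last stage drains everything.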

Args:

  • c_tma_op (TMATensorTile): TMA tensor tile descriptor.
  • src (LayoutTensor): Source shared memory tile.
  • coords (Tuple): Destination coordinates in global memory.
