
Mojo function

tma_wait_pipelined

tma_wait_pipelined[c_type: DType, c_layout: Layout, c_desc_layout: Layout, is_last_stage: Bool](c_tma_op: TMATensorTile[c_type, c_layout, c_desc_layout])

Wait for TMA stores with pipelining.

For the SM100 output pipeline, the wait depends on the stage:

  • Non-last stages: Keep 1 store in flight for pipelining
  • Last stage: Wait for all stores to complete
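
The two rules above can be illustrated with a small host-side model. This is only a sketch in Python, not the Mojo implementation: `StoreQueue`, `wait_pipelined`, and `run_pipeline` are hypothetical names, stores are assumed to complete in commit order, and `is_last_stage` is a runtime argument here although it is a compile-time parameter in the real function.

```python
from collections import deque


class StoreQueue:
    """Toy model of committed TMA stores that have not finished yet (hypothetical)."""

    def __init__(self):
        self.in_flight = deque()  # stores committed but not yet complete
        self.completed = []       # stores that have been waited on

    def commit(self, stage):
        # Issue the store for one output stage.
        self.in_flight.append(stage)

    def wait_until_at_most(self, n):
        # Model of "block until at most n stores are still pending".
        # Assumption: the oldest store completes first.
        while len(self.in_flight) > n:
            self.completed.append(self.in_flight.popleft())


def wait_pipelined(queue, is_last_stage):
    # Non-last stages keep 1 store in flight; the last stage drains everything.
    queue.wait_until_at_most(0 if is_last_stage else 1)


def run_pipeline(num_stages):
    queue = StoreQueue()
    pending_after_wait = []
    for stage in range(num_stages):
        queue.commit(stage)
        wait_pipelined(queue, is_last_stage=(stage == num_stages - 1))
        pending_after_wait.append(len(queue.in_flight))
    return pending_after_wait, queue.completed


if __name__ == "__main__":
    print(run_pipeline(4))
```

In this model, every non-last stage returns with one store still pending, so the next stage can start writing its output while the previous store finishes. Only the last stage pays for a full drain.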

Template Parameters:

  • c_type: Output data type.
  • c_layout: Global memory layout for C.
  • c_desc_layout: TMA descriptor layout for C.
  • is_last_stage: If True, wait for all stores to complete; otherwise keep 1 store in flight.

Args:

  • c_tma_op: TMA tensor tile for C, of type TMATensorTile[c_type, c_layout, c_desc_layout].
