Mojo function
tma_wait_pipelined
tma_wait_pipelined[c_type: DType, tma_rank: Int, tile_shape: IndexList[tma_rank], desc_shape: IndexList[tma_rank], is_last_stage: Bool](c_tma_op: TMATensorTile[c_type, tma_rank, tile_shape, desc_shape])
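Waits for outstanding TMA stores issued through c_tma_op. The compile-time parameter is_last_stage selects how many stores may remain in flight when the wait returns.
For the SM100 output pipeline:
- Non-last stages (is_last_stage is False): keep 1 store in flight so that it overlaps with the next stage.
- Last stage (is_last_stage is True): wait for all stores to complete.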
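The two cases above can be illustrated with a small host-side model. This is a conceptual sketch in Python, not the Mojo implementation: StoreQueue, wait_until_at_most, wait_pipelined, and run_pipeline are hypothetical names invented for illustration. It assumes that stores complete in issue order and, for simplicity, that they only complete when a wait drains them. The flag is_last_stage is an ordinary runtime argument here, whereas in the Mojo function it is a compile-time parameter.

```python
from collections import deque


class StoreQueue:
    """Toy model of asynchronous stores that complete in issue order."""

    def __init__(self):
        self.in_flight = deque()  # stores issued but not yet complete
        self.completed = []       # stores known to be complete

    def issue(self, stage):
        # Hypothetical stand-in for issuing and committing one TMA store.
        self.in_flight.append(stage)

    def wait_until_at_most(self, n):
        # Drain the oldest stores until no more than n remain pending.
        while len(self.in_flight) > n:
            self.completed.append(self.in_flight.popleft())


def wait_pipelined(queue, is_last_stage):
    # Non-last stages leave one store in flight; the last stage drains all.
    queue.wait_until_at_most(0 if is_last_stage else 1)


def run_pipeline(num_stages):
    """Return the pending-store count after each wait, and completion order."""
    queue = StoreQueue()
    pending_after_wait = []
    for stage in range(num_stages):
        queue.issue(stage)
        wait_pipelined(queue, is_last_stage=(stage == num_stages - 1))
        pending_after_wait.append(len(queue.in_flight))
    return pending_after_wait, queue.completed


if __name__ == "__main__":
    pending, completed = run_pipeline(4)
    print(pending)    # pending stores after each stage's wait
    print(completed)  # order in which stores were retired
```

In this model every non-last stage returns from its wait with exactly one store still pending, so the next stage can start while that store is outstanding. Only the final stage blocks until the queue is empty.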