Mojo function
tma_wait_pipelined
tma_wait_pipelined[c_type: DType, c_layout: Layout, c_desc_layout: Layout, is_last_stage: Bool](c_tma_op: TMATensorTile[c_type, c_layout, c_desc_layout])
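Waits for outstanding TMA stores while allowing stores to overlap across pipeline stages.
For the SM100 output pipeline, the `is_last_stage` parameter selects the behavior:
- Non-last stages: keep 1 store in flight so it can overlap with the next stage's work.
- Last stage: wait for all stores to complete.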
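The two cases above can be illustrated with a small host-side model. This is only a sketch in Python, not the Mojo implementation and not GPU code. `StoreQueue`, `tma_wait_pipelined_model`, and `run_output_pipeline` are hypothetical names introduced for illustration. The model assumes that stores complete in the order they were committed, and that waiting means blocking until at most a given number of stores remain pending. In the real function `is_last_stage` is a compile-time parameter; here it is an ordinary runtime argument.

```python
from collections import deque


class StoreQueue:
    """Toy model of committed asynchronous stores (hypothetical, host-side only)."""

    def __init__(self):
        self.pending = deque()  # committed but not yet complete, oldest first
        self.completed = []     # stores that have finished, in completion order

    def commit(self, stage):
        # Issue a store for this stage; it stays pending until it is waited on.
        self.pending.append(stage)

    def wait_until_at_most(self, max_in_flight):
        # Model of a blocking wait: retire the oldest stores until no more
        # than `max_in_flight` are still pending.
        while len(self.pending) > max_in_flight:
            self.completed.append(self.pending.popleft())


def tma_wait_pipelined_model(queue, is_last_stage):
    # Non-last stages leave 1 store in flight; the last stage drains everything.
    queue.wait_until_at_most(0 if is_last_stage else 1)


def run_output_pipeline(num_stages):
    """Commit one store per stage and record how many remain in flight after each wait."""
    queue = StoreQueue()
    in_flight_after_wait = []
    for stage in range(num_stages):
        queue.commit(stage)
        tma_wait_pipelined_model(queue, is_last_stage=(stage == num_stages - 1))
        in_flight_after_wait.append(len(queue.pending))
    return in_flight_after_wait, queue.completed
```

In this model, every non-last stage returns from its wait with exactly one store still pending, so that store overlaps with the next stage. The final stage returns only once nothing is pending, which is the point at which it would be safe to reuse or release the source buffers.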