Mojo function

tma_wait_pipelined

tma_wait_pipelined[c_type: DType, c_layout: Layout, c_desc_layout: Layout, is_last_stage: Bool](c_tma_op: TMATensorTile[c_type, c_layout, c_desc_layout])

Wait for TMA stores with pipelining.

For the SM100 output pipeline, the wait depends on is_last_stage:

- Non-last stages: keep 1 store in flight for pipelining.
- Last stage: wait for all stores to complete.
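
The two cases above can be modeled on the host as a small queue of committed stores. This is a plain Python sketch of the waiting policy only, not the Mojo implementation: `StoreQueue`, `wait_pipelined`, and `run_pipeline` are hypothetical names, and the model assumes stores complete in commit order and that "wait" means "block until at most N committed stores are still pending".

```python
from collections import deque


class StoreQueue:
    """Toy model of committed asynchronous stores (hypothetical helper)."""

    def __init__(self):
        self.pending = deque()  # committed stores not yet complete, oldest first
        self.completed = []     # stores in the order they were drained

    def commit(self, stage):
        # Record that the store for this stage has been issued.
        self.pending.append(stage)

    def wait_group(self, n):
        # Assumed semantics: block until at most n stores remain pending.
        while len(self.pending) > n:
            self.completed.append(self.pending.popleft())


def wait_pipelined(queue, is_last_stage):
    # Non-last stages leave 1 store in flight; the last stage drains everything.
    queue.wait_group(0 if is_last_stage else 1)


def run_pipeline(num_stages):
    """Issue one store per stage and record how many stay in flight after each wait."""
    queue = StoreQueue()
    in_flight = []
    for stage in range(num_stages):
        queue.commit(stage)
        wait_pipelined(queue, is_last_stage=(stage == num_stages - 1))
        in_flight.append(len(queue.pending))
    return in_flight, queue.completed
```

In this model, every non-last stage returns with one store still pending, so the next stage can start while that store drains, and the last stage returns with nothing pending.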
