Mojo function
tma_store_fence
tma_store_fence()
Establishes a memory fence for shared memory stores in TMA operations.
This function creates a memory barrier that ensures all previous shared memory stores are completed before subsequent TMA (Tensor Memory Accelerator) store operations begin. This ordering is required for memory consistency: without it, a TMA store could read shared memory before earlier writes to it are visible.
Note: This fence specifically targets the CTA (Cooperative Thread Array) scope and is used to synchronize async shared memory operations.
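The ordering described above can be sketched as follows. This is a minimal illustration, not a complete kernel: the import paths (`gpu.memory`, `gpu.sync`) are assumptions about the standard library layout, the function name `publish_tile_before_tma_store` is hypothetical, and the shared memory writes and the TMA store itself are left as comments because they depend on a tile layout and TMA descriptor that this page does not define.

```mojo
# Assumed import paths; check them against your Mojo version.
from gpu.memory import tma_store_fence
from gpu.sync import barrier


# Hypothetical helper, intended to be called from inside a GPU kernel.
fn publish_tile_before_tma_store():
    # 1. (Not shown) each thread writes its part of the tile to shared memory.

    # 2. Fence: order the preceding shared memory stores before any
    #    subsequent TMA store, at CTA scope.
    tma_store_fence()

    # 3. Wait until every thread in the block has executed its fence.
    barrier()

    # 4. (Not shown) a single thread issues the TMA store that copies the
    #    tile from shared memory to global memory.
```

The fence comes before the barrier so that each thread has ordered its own stores by the time the block synchronizes; the thread that issues the TMA store then proceeds only after all threads have passed both steps.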