Mojo struct
LoadOrderBarrier
struct LoadOrderBarrier
Barrier for coordinating mainloop load and epilogue load warps.
This barrier implements a simple producer-consumer pattern where the mainloop load warp (producer) signals after completing prologue loads, and the epilogue load warp (consumer) waits before starting C loads.
Protocol:
- Mainloop load warp issues prologue TMA loads
- Mainloop load warp calls arrive()
- Epilogue load warp calls wait() before starting C loads
- Epilogue load warp can now issue TMA loads without contention
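Because the epilogue load warp cannot start its C loads until the prologue loads have been issued, the two warps do not contend for TMA resources and the prologue loads always go first.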
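The ordering guarantee of the protocol above can be modeled on the CPU. The following is a minimal sketch in Python, not the Mojo API: `threading.Event` stands in for the shared-memory barrier, two threads stand in for the two warps, and the names `run_one_tile`, `mainloop_load_warp`, and `epilogue_load_warp` are hypothetical.

```python
import threading


def run_one_tile():
    """Model one tile: the consumer may not start until the producer arrives."""
    events = []
    events_lock = threading.Lock()
    # Assumption: an Event is a stand-in for the shared-memory barrier.
    prologue_done = threading.Event()

    def log(message):
        with events_lock:
            events.append(message)

    def mainloop_load_warp():
        # Producer: issue prologue loads, then signal (models arrive()).
        log("mainloop: issue prologue TMA loads")
        prologue_done.set()

    def epilogue_load_warp():
        # Consumer: block until the producer has signaled (models wait()).
        prologue_done.wait()
        log("epilogue: issue C loads")

    # Start the consumer first to show that it really blocks.
    consumer = threading.Thread(target=epilogue_load_warp)
    producer = threading.Thread(target=mainloop_load_warp)
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    return events
```

Whichever thread is scheduled first, the recorded events always list the prologue loads before the C loads, which is the ordering the barrier is meant to enforce.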
Phase Tracking
The barrier uses a single phase bit that toggles once per tile iteration. This lets the same barrier synchronize the two warps for each successive output tile.
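To illustrate why a toggling phase bit is enough, here is a small single-threaded Python model. It is a sketch under stated assumptions rather than the actual implementation: it assumes the common parity-wait convention (a wait on parity `p` succeeds once the phase with parity `p` has completed), and `PhaseBarrierModel`, `try_wait`, `simulate`, and `stale_phase_passes_early` are hypothetical names.

```python
class PhaseBarrierModel:
    """Toy model of a phase-tracked barrier (hypothetical, not the Mojo API)."""

    def __init__(self, arrive_count=1):
        self.arrive_count = arrive_count
        self.pending = arrive_count
        self.current_parity = 0  # parity of the phase currently in progress

    def arrive(self):
        # When the last expected arrival lands, the current phase completes.
        self.pending -= 1
        if self.pending == 0:
            self.pending = self.arrive_count
            self.current_parity ^= 1

    def try_wait(self, phase):
        # Assumed convention: the phase with parity `phase` has completed
        # once the barrier has moved on to the opposite parity.
        return self.current_parity != phase


def simulate(num_tiles):
    """Return (tile, phase, ready_before_arrive, ready_after_arrive) per tile."""
    barrier = PhaseBarrierModel()
    phase = 0  # the consumer's local phase bit
    trace = []
    for tile in range(num_tiles):
        ready_before = barrier.try_wait(phase)
        barrier.arrive()
        ready_after = barrier.try_wait(phase)
        trace.append((tile, phase, ready_before, ready_after))
        phase ^= 1  # models step(): toggle for the next tile
    return trace


def stale_phase_passes_early():
    """Show what happens if the consumer forgets to toggle its phase."""
    barrier = PhaseBarrierModel()
    barrier.arrive()  # tile 0 completes
    # Tile 1: the producer has not arrived yet, but the stale phase 0
    # already counts as completed, so the wait would pass too early.
    return barrier.try_wait(0)
```

In this model, each tile's wait is not satisfied before the producer arrives and is satisfied afterwards, as long as the phase is toggled between tiles. Reusing a stale phase lets the wait pass before the producer has arrived for the new tile.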
Fields
- barrier (MbarPtr)
- phase (UInt32)
Implemented traits
AnyType, ImplicitlyDestructible
Methods
__init__
__init__(out self, ptr: UnsafePointer[SharedMemBarrier, MutAnyOrigin, address_space=AddressSpace.SHARED], initial_phase: UInt32 = UInt32(0))
Initialize the load order barrier.
Args:
- ptr (UnsafePointer[SharedMemBarrier, MutAnyOrigin, address_space=AddressSpace.SHARED]): Pointer to shared memory barrier.
- initial_phase (UInt32): Initial phase (default 0).
init
init(self, arrive_count: Int32 = Int32(1))
Initialize the barrier.
Should be called by a single thread (elect_one_thread) during kernel initialization.
Args:
- arrive_count (Int32): Number of arrives to expect (default 1 for single mainloop load warp).
arrive
arrive(self)
Signal that mainloop prologue loads are complete.
Called by the mainloop load warp after issuing prologue TMA loads.
wait
wait(self)
Wait for mainloop to signal prologue completion.
Called by the epilogue load warp before starting C loads.
step
step(mut self)
Toggle phase for next tile iteration.
Called after both arrive and wait have completed to prepare for the next output tile's synchronization.
arrive_and_step
arrive_and_step(mut self)
Arrive and advance phase in one call.
Convenience method for mainloop load warp:
load_order_barrier.arrive_and_step()
wait_and_step
wait_and_step(mut self)
Wait and advance phase in one call.
Convenience method for epilogue load warp:
load_order_barrier.wait_and_step()