Mojo struct
OutputTilePipeline
@register_passable(trivial)
struct OutputTilePipeline[num_stages: Int, stage_stride_cols: Int, cta_group: Int]
Pipeline for MMA→Epilogue TMEM stage synchronization.
Fields
- pipeline (OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline)
- tmem (OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Tmem)
- mma_complete_mask (UInt16)
Implemented traits
AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable
comptime members
__copyinit__is_trivial
comptime __copyinit__is_trivial = True
__del__is_trivial
comptime __del__is_trivial = True
__moveinit__is_trivial
comptime __moveinit__is_trivial = True
BarrierArray
comptime BarrierArray = SMemArrayType[SharedMemBarrier, (num_stages * 2)]
Pipeline
comptime Pipeline = ProducerConsumerPipeline[num_stages]
Stage
comptime Stage = OutputStage[num_stages, stage_stride_cols, cta_group]
Tmem
comptime Tmem = TmemAllocation[cta_group]
Methods
__init__
__init__(barriers: SMemArrayType[SharedMemBarrier, (num_stages * 2)], tmem: TmemAllocation[cta_group], mma_complete_mask: UInt16) -> Self
Initialize from barrier array, TMEM allocation, and multicast mask.
init_barriers
static init_barriers(storage_ptr: LegacyUnsafePointer[SharedMemBarrier, address_space=AddressSpace.SHARED], producer_arv_count: Int32, consumer_arv_count: Int32)
Initialize pipeline barriers. Call this once, from the single thread selected by elect_one.
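As a rough illustration of why BarrierArray holds num_stages * 2 barriers, the host-side Python sketch below models one plausible layout. It is an assumption rather than something this page states: the first num_stages barriers signal "stage full" and expect producer_arv_count arrivals, and the remaining num_stages signal "stage empty" and expect consumer_arv_count arrivals. The function name barrier_arrival_counts is hypothetical and is not part of the Mojo API.

```python
def barrier_arrival_counts(num_stages, producer_arv_count, consumer_arv_count):
    """Return the expected arrival count for each of the 2 * num_stages barriers.

    Assumed layout (not confirmed by the reference page):
      indices [0, num_stages)              -> "full" barriers, arrived at by the producer
      indices [num_stages, 2 * num_stages) -> "empty" barriers, arrived at by the consumer
    """
    full = [producer_arv_count] * num_stages
    empty = [consumer_arv_count] * num_stages
    return full + empty


# Example: a 2-stage pipeline with 1 producer arrival and 4 consumer arrivals.
counts = barrier_arrival_counts(2, 1, 4)
print(len(counts), counts)
```

Under this assumed layout, a single elected thread would write each count once before any warp waits on a barrier, which is consistent with the "called once" requirement above.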
acquire_for_mma
acquire_for_mma(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage
Acquire stage for MMA, waiting for epilogue to finish.
Returns:
OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage
release_from_mma
release_from_mma(mut self, stage: OutputStage[num_stages, stage_stride_cols, cta_group])
Signal MMA completion using mma_arrive (1-SM) or multicast (2-SM).
acquire_for_epilogue
acquire_for_epilogue(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage
Acquire stage for epilogue, waiting for MMA to complete.
Returns:
OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage
release_from_epilogue
release_from_epilogue(mut self)
Signal epilogue completion, freeing stage for MMA reuse.
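The four acquire/release methods above form a two-sided handshake over a ring of stages. The Python sketch below is a host-side toy model of that handshake, not the Mojo implementation: it reuses the method names for readability, substitutes semaphores for shared-memory barriers, and uses threads in place of the MMA and epilogue warps. The class name OutputTilePipelineModel, the helper run_tiles, and the use of plain integers as stage handles are all assumptions made for illustration.

```python
import threading


class OutputTilePipelineModel:
    """Toy model: each stage is either empty (MMA may write) or full (epilogue may read)."""

    def __init__(self, num_stages):
        self.num_stages = num_stages
        # Every stage starts empty, so each "empty" semaphore starts at 1.
        self.empty = [threading.Semaphore(1) for _ in range(num_stages)]
        self.full = [threading.Semaphore(0) for _ in range(num_stages)]
        self.mma_index = 0  # touched only by the MMA side
        self.epi_index = 0  # touched only by the epilogue side

    def acquire_for_mma(self):
        stage = self.mma_index
        self.empty[stage].acquire()  # wait for the epilogue to finish with it
        return stage

    def release_from_mma(self, stage):
        self.full[stage].release()  # signal MMA completion
        self.mma_index = (stage + 1) % self.num_stages

    def acquire_for_epilogue(self):
        stage = self.epi_index
        self.full[stage].acquire()  # wait for MMA to complete
        return stage

    def release_from_epilogue(self):
        stage = self.epi_index
        self.empty[stage].release()  # free the stage for MMA reuse
        self.epi_index = (stage + 1) % self.num_stages


def run_tiles(num_stages, num_tiles):
    """Push num_tiles values through the pipeline and return them in consumed order."""
    pipe = OutputTilePipelineModel(num_stages)
    slots = [None] * num_stages  # stands in for the per-stage accumulator storage
    consumed = []

    def mma_warp():
        for tile in range(num_tiles):
            stage = pipe.acquire_for_mma()
            slots[stage] = tile * tile  # stand-in for the MMA result
            pipe.release_from_mma(stage)

    def epilogue_warp():
        for _ in range(num_tiles):
            stage = pipe.acquire_for_epilogue()
            consumed.append(slots[stage])
            pipe.release_from_epilogue()

    threads = [threading.Thread(target=mma_warp), threading.Thread(target=epilogue_warp)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed


print(run_tiles(num_stages=2, num_tiles=5))
```

With more than one stage, the MMA side can start filling the next stage while the epilogue is still draining the previous one. That overlap is the usual motivation for a multi-stage pipeline.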
producer
producer[origin: MutOrigin, //](ref [origin] self) -> OutputProducer[origin, num_stages, stage_stride_cols, cta_group]
Get producer view for MMA warp.
Returns:
OutputProducer
consumer
consumer[origin: MutOrigin, //](ref [origin] self) -> OutputConsumer[origin, num_stages, stage_stride_cols, cta_group]
Get consumer view for epilogue warp.
Returns:
OutputConsumer
get_pipeline
get_pipeline(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline
Get underlying pipeline (used during barrier initialization).
Returns:
OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline