Mojo struct
OutputTilePipeline
@register_passable(trivial)
struct OutputTilePipeline[num_stages: Int, stage_stride_cols: Int, cta_group: Int]
Pipeline for MMA→Epilogue TMEM stage synchronization.
Fields
- pipeline (OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline)
- tmem (OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Tmem)
- mma_complete_mask (UInt16)
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDestructible,
Movable
comptime members
__copyinit__is_trivial
comptime __copyinit__is_trivial = True
__del__is_trivial
comptime __del__is_trivial = True
__moveinit__is_trivial
comptime __moveinit__is_trivial = True
BarrierArray
comptime BarrierArray = SMemArray[SharedMemBarrier, (num_stages * 2)]
Pipeline
comptime Pipeline = ProducerConsumerPipeline[num_stages]
Stage
comptime Stage = OutputStage[num_stages, stage_stride_cols, cta_group]
Tmem
comptime Tmem = TmemAllocation[cta_group]
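The BarrierArray length of num_stages * 2 is consistent with a common two-barriers-per-stage layout: one barrier the producer signals when a stage is full, and one the consumer signals when it is empty again. The Python sketch below is a host-side model of that arithmetic only, not Mojo or GPU code. The ordering of the barriers (all "full" barriers first) and the stage offset formula (stage index times stage_stride_cols) are assumptions made for illustration, and both helper names are hypothetical.

```python
def barrier_indices(stage, num_stages):
    """Return (full_idx, empty_idx) for a stage.

    Assumption: the array holds num_stages "full" barriers followed by
    num_stages "empty" barriers. The real layout may differ.
    """
    if not 0 <= stage < num_stages:
        raise ValueError("stage out of range")
    return stage, num_stages + stage


def stage_tmem_offset(base_col, stage, stage_stride_cols):
    """Assumed column offset of a stage's accumulator inside the TMEM allocation."""
    return base_col + stage * stage_stride_cols


NUM_STAGES = 2
STAGE_STRIDE_COLS = 256  # illustrative value, not taken from the source

for s in range(NUM_STAGES):
    full_idx, empty_idx = barrier_indices(s, NUM_STAGES)
    offset = stage_tmem_offset(0, s, STAGE_STRIDE_COLS)
    print(f"stage {s}: full={full_idx} empty={empty_idx} tmem_col={offset}")
```

Under these assumptions every stage gets a distinct pair of barriers and a non-overlapping column range, which is what lets the MMA warp fill one stage while the epilogue drains another.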
Methods
__init__
__init__(barriers: SMemArray[SharedMemBarrier, (num_stages * 2)], tmem: TmemAllocation[cta_group], mma_complete_mask: UInt16) -> Self
Initialize from barrier array, TMEM allocation, and multicast mask.
init_barriers
static init_barriers(storage_ptr: LegacyUnsafePointer[SharedMemBarrier, address_space=AddressSpace.SHARED], producer_arv_count: Int32, consumer_arv_count: Int32)
Initialize pipeline barriers. Called once, by the thread selected via elect_one.
acquire_for_mma
acquire_for_mma(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage
Acquire stage for MMA, waiting for epilogue to finish.
Returns:
OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage
release_from_mma
release_from_mma(mut self, stage: OutputStage[num_stages, stage_stride_cols, cta_group])
Signal MMA completion using mma_arrive (1-SM) or multicast (2-SM).
acquire_for_epilogue
acquire_for_epilogue(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage
Acquire stage for epilogue, waiting for MMA to complete.
Returns:
OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage
release_from_epilogue
release_from_epilogue(mut self)
Signal epilogue completion, freeing stage for MMA reuse.
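The four acquire/release methods above form a producer/consumer handshake over a ring of num_stages stages. As a rough mental model, the Python sketch below mimics that handshake on the host with threads and semaphores. It is not Mojo or GPU code. StagePipelineModel, run_model, and the use of semaphores in place of shared-memory barriers are all assumptions for illustration; only the four method names are taken from this page.

```python
import threading


class StagePipelineModel:
    """Hypothetical host-side model of an N-stage MMA -> epilogue ring."""

    def __init__(self, num_stages):
        self.num_stages = num_stages
        # Every stage starts empty, so the producer may fill it immediately.
        self.empty = [threading.Semaphore(1) for _ in range(num_stages)]
        self.full = [threading.Semaphore(0) for _ in range(num_stages)]
        self.producer_idx = 0  # touched only by the producer thread
        self.consumer_idx = 0  # touched only by the consumer thread

    def acquire_for_mma(self):
        stage = self.producer_idx
        self.empty[stage].acquire()  # wait for the epilogue to finish with it
        return stage

    def release_from_mma(self, stage):
        self.full[stage].release()  # signal that the accumulator is ready
        self.producer_idx = (stage + 1) % self.num_stages

    def acquire_for_epilogue(self):
        stage = self.consumer_idx
        self.full[stage].acquire()  # wait for the MMA to complete
        return stage

    def release_from_epilogue(self):
        stage = self.consumer_idx
        self.empty[stage].release()  # free the stage for MMA reuse
        self.consumer_idx = (stage + 1) % self.num_stages


def run_model(num_stages=2, num_tiles=5):
    pipe = StagePipelineModel(num_stages)
    tmem = [None] * num_stages  # stand-in for per-stage accumulators
    results = []

    def mma_warp():
        for tile in range(num_tiles):
            stage = pipe.acquire_for_mma()
            tmem[stage] = tile * tile  # stand-in for the accumulated tile
            pipe.release_from_mma(stage)

    def epilogue_warp():
        for _ in range(num_tiles):
            stage = pipe.acquire_for_epilogue()
            results.append(tmem[stage])  # read before releasing the stage
            pipe.release_from_epilogue()

    threads = [threading.Thread(target=mma_warp),
               threading.Thread(target=epilogue_warp)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_model())  # [0, 1, 4, 9, 16]
```

In this model, with more than one stage the producer can run ahead of the consumer by up to num_stages tiles, while a single stage forces strict alternation between the two roles.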
producer
producer[origin: MutOrigin, //](ref [origin] self) -> OutputProducer[origin, num_stages, stage_stride_cols, cta_group]
Get producer view for MMA warp.
Returns:
OutputProducer
consumer
consumer[origin: MutOrigin, //](ref [origin] self) -> OutputConsumer[origin, num_stages, stage_stride_cols, cta_group]
Get consumer view for epilogue warp.
Returns:
OutputConsumer
get_pipeline
get_pipeline(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline
Get underlying pipeline (used during barrier initialization).
Returns:
OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline
per_k
per_k[origin: MutOrigin, //](ref [origin] self) -> OutputKPipeline[origin, num_stages, stage_stride_cols, cta_group]
Get per-K-iteration view for kernels with per-K signaling.
Unlike producer()/consumer(), which signal once per tile (after all K iterations), this view signals after each K iteration. Use it for kernels with per-K accumulation patterns (e.g., blockwise FP8).
Returns:
OutputKPipeline: A view that provides produce()/consume() context managers for per-K-iteration barrier signaling.
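The difference between per-tile and per-K signaling comes down to where the context manager scope sits relative to the K loop. The Python sketch below counts signals in both arrangements. It is a host-side illustration, not Mojo or GPU code: SignalCounter and both driver functions are hypothetical, and the counter increment stands in for the barrier arrival that happens on scope exit.

```python
from contextlib import contextmanager


class SignalCounter:
    """Hypothetical stand-in that counts producer signals."""

    def __init__(self):
        self.produced = 0

    @contextmanager
    def produce(self):
        yield
        self.produced += 1  # the "signal" fires when the scope exits


def per_tile_signals(num_tiles, num_k_iters):
    counter = SignalCounter()
    for _ in range(num_tiles):
        with counter.produce():  # one scope around the whole K loop
            for _ in range(num_k_iters):
                pass  # MMA work for one K iteration
    return counter.produced


def per_k_signals(num_tiles, num_k_iters):
    counter = SignalCounter()
    for _ in range(num_tiles):
        for _ in range(num_k_iters):
            with counter.produce():  # one scope per K iteration
                pass  # MMA work for one K iteration
    return counter.produced


print(per_tile_signals(3, 4))  # 3
print(per_k_signals(3, 4))     # 12
```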
per_k_epilogue
per_k_epilogue[output_origin: MutOrigin, input_origin: MutOrigin, num_input_stages: Int](ref [output_origin] self, ref [input_origin] input_pipeline: ProducerConsumerPipeline[num_input_stages]) -> EpilogueKContext[output_origin, input_origin, num_stages, stage_stride_cols, cta_group, num_input_stages]
Get combined per-K epilogue context for blockwise FP8.
Bundles output pipeline (MMA->Epilogue sync) and input pipeline (A-scales consumption) into a single context manager.
Example:
    for k_iter in range(num_iters):
        with output_pipeline.per_k_epilogue(input_pipeline) as stage:
            accum.promote(stage, ...)
        # Both pipelines signaled automatically
Args:
- input_pipeline (ProducerConsumerPipeline): The input pipeline for A-scales consumption.
Returns:
EpilogueKContext: Context manager that handles both pipelines.
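One way to picture a combined context of this kind is a single context manager that waits on two pipelines on entry and signals both on exit. The Python sketch below models that shape on the host. It is not Mojo or GPU code: LoggedPipeline and per_k_epilogue_model are hypothetical, and the order in which the two pipelines are waited on and signaled is an assumption, not something this page specifies.

```python
from contextlib import contextmanager


class LoggedPipeline:
    """Hypothetical pipeline stand-in that records waits and signals."""

    def __init__(self, name, log, num_stages=2):
        self.name = name
        self.log = log
        self.num_stages = num_stages
        self.index = 0

    def acquire(self):
        self.log.append(f"{self.name}:wait")
        return self.index

    def release(self):
        self.log.append(f"{self.name}:signal")
        self.index = (self.index + 1) % self.num_stages


@contextmanager
def per_k_epilogue_model(output_pipeline, input_pipeline):
    stage = output_pipeline.acquire()  # wait for the MMA result
    input_pipeline.acquire()           # wait for the scales to arrive
    try:
        yield stage
    finally:
        # Assumed order: release the input first, then the output stage.
        input_pipeline.release()
        output_pipeline.release()


log = []
output_pipeline = LoggedPipeline("output", log)
input_pipeline = LoggedPipeline("input", log)

stages = []
for k_iter in range(2):
    with per_k_epilogue_model(output_pipeline, input_pipeline) as stage:
        stages.append(stage)
        log.append(f"body:{k_iter}")

print(stages)
print(log)
```

The caller's loop body never touches either pipeline directly, which is the convenience the bundled context provides.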