Mojo struct

OutputTilePipeline

@register_passable(trivial) struct OutputTilePipeline[num_stages: Int, stage_stride_cols: Int, cta_group: Int]

Pipeline for MMA→Epilogue TMEM stage synchronization.

Fields

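  • pipeline (OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline): The underlying ProducerConsumerPipeline[num_stages].
  • tmem (OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Tmem): The TMEM allocation, a TmemAllocation[cta_group].
  • mma_complete_mask (UInt16): The multicast mask supplied to __init__.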

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable

comptime members

__copyinit__is_trivial

comptime __copyinit__is_trivial = True

__del__is_trivial

comptime __del__is_trivial = True

__moveinit__is_trivial

comptime __moveinit__is_trivial = True

BarrierArray

comptime BarrierArray = SMemArrayType[SharedMemBarrier, (num_stages * 2)]

Pipeline

comptime Pipeline = ProducerConsumerPipeline[num_stages]

Stage

comptime Stage = OutputStage[num_stages, stage_stride_cols, cta_group]
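As a rough illustration of how a stage index could map to a TMEM location, the sketch below assumes that stage_stride_cols is the number of TMEM columns between the starts of consecutive stages. The name suggests this, but this page does not state it. The function stage_tmem_column and its base_col parameter are hypothetical and are not part of the Mojo API.

```python
def stage_tmem_column(base_col, stage_index, stage_stride_cols, num_stages):
    """Return the first TMEM column of a stage under an assumed linear layout.

    Hypothetical helper: assumes stage i starts stage_stride_cols columns
    after stage i - 1, beginning at base_col.
    """
    if not 0 <= stage_index < num_stages:
        raise ValueError("stage_index must be in [0, num_stages)")
    return base_col + stage_index * stage_stride_cols


# Example: two stages, each assumed to be 128 columns apart.
offsets = [stage_tmem_column(0, i, 128, 2) for i in range(2)]
```

Under that assumption, a larger stage_stride_cols leaves more columns per stage, so num_stages * stage_stride_cols must fit within the allocation.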

Tmem

comptime Tmem = TmemAllocation[cta_group]

Methods

__init__

__init__(barriers: SMemArrayType[SharedMemBarrier, (num_stages * 2)], tmem: TmemAllocation[cta_group], mma_complete_mask: UInt16) -> Self

Initialize from barrier array, TMEM allocation, and multicast mask.

init_barriers

static init_barriers(storage_ptr: LegacyUnsafePointer[SharedMemBarrier, address_space=AddressSpace.SHARED], producer_arv_count: Int32, consumer_arv_count: Int32)

Initialize the pipeline barriers. Called once, by the elect_one thread.

acquire_for_mma

acquire_for_mma(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage

Acquire stage for MMA, waiting for epilogue to finish.

Returns:

OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage

release_from_mma

release_from_mma(mut self, stage: OutputStage[num_stages, stage_stride_cols, cta_group])

Signal MMA completion using mma_arrive (1-SM) or multicast (2-SM).

acquire_for_epilogue

acquire_for_epilogue(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage

Acquire stage for epilogue, waiting for MMA to complete.

Returns:

OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Stage

release_from_epilogue

release_from_epilogue(mut self)

Signal epilogue completion, freeing stage for MMA reuse.
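The four acquire and release methods above form a producer-consumer handshake. The following is a minimal host-side model of that handshake in Python, not Mojo code and not the library's implementation. It assumes that the num_stages * 2 barriers split into one "full" and one "empty" barrier per stage, uses semaphores in place of shared-memory barriers, and uses a Python list in place of TMEM. The class OutputTilePipelineModel and the function run_pipeline are hypothetical names.

```python
import threading


class OutputTilePipelineModel:
    """Host-side model of a multi-stage MMA -> epilogue handshake."""

    def __init__(self, num_stages):
        self.num_stages = num_stages
        # Two synchronization objects per stage, mirroring num_stages * 2.
        # Assumption: one signals "stage full", the other "stage empty".
        self.full = [threading.Semaphore(0) for _ in range(num_stages)]
        self.empty = [threading.Semaphore(1) for _ in range(num_stages)]
        self.slots = [None] * num_stages  # stand-in for TMEM stages
        self.mma_stage = 0       # next stage the MMA side will fill
        self.epilogue_stage = 0  # next stage the epilogue will drain

    def acquire_for_mma(self):
        stage = self.mma_stage
        self.empty[stage].acquire()  # wait for the epilogue to free the stage
        return stage

    def release_from_mma(self, stage):
        self.full[stage].release()   # signal that the accumulator is ready
        self.mma_stage = (stage + 1) % self.num_stages

    def acquire_for_epilogue(self):
        stage = self.epilogue_stage
        self.full[stage].acquire()   # wait for the MMA to complete
        return stage

    def release_from_epilogue(self):
        stage = self.epilogue_stage
        self.empty[stage].release()  # free the stage for MMA reuse
        self.epilogue_stage = (stage + 1) % self.num_stages


def run_pipeline(num_stages, num_tiles):
    """Run one MMA thread and one epilogue thread over num_tiles tiles."""
    pipe = OutputTilePipelineModel(num_stages)
    results = []

    def mma_warp():
        for tile in range(num_tiles):
            stage = pipe.acquire_for_mma()
            pipe.slots[stage] = tile * tile  # stand-in for the accumulator
            pipe.release_from_mma(stage)

    def epilogue_warp():
        for _ in range(num_tiles):
            stage = pipe.acquire_for_epilogue()
            results.append(pipe.slots[stage])  # read before freeing the stage
            pipe.release_from_epilogue()

    threads = [threading.Thread(target=mma_warp),
               threading.Thread(target=epilogue_warp)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this model the MMA thread can run at most num_stages tiles ahead of the epilogue thread, because each stage's "empty" semaphore blocks reuse until the epilogue releases it. As in the signatures above, the release calls are the ones that mutate state: they advance to the next stage, while the acquire calls only wait.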

producer

producer[origin: MutOrigin, //](ref [origin] self) -> OutputProducer[origin, num_stages, stage_stride_cols, cta_group]

Get producer view for MMA warp.

Returns:

OutputProducer

consumer

consumer[origin: MutOrigin, //](ref [origin] self) -> OutputConsumer[origin, num_stages, stage_stride_cols, cta_group]

Get consumer view for epilogue warp.

Returns:

OutputConsumer

get_pipeline

get_pipeline(self) -> OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline

Get underlying pipeline (used during barrier initialization).

Returns:

OutputTilePipeline[num_stages, stage_stride_cols, cta_group].Pipeline
