Mojo struct

OutputKPipeline

@register_passable(trivial) struct OutputKPipeline[origin: MutOrigin, num_stages: Int, stage_stride_cols: Int, cta_group: Int]

Per-K-iteration view of OutputTilePipeline.

Unlike the standard producer()/consumer() views, which signal once per tile (after all K iterations), this view signals after each K iteration. Use it for kernels with per-K accumulation patterns (e.g., blockwise FP8).

Example (MMA warp):

    for i in range(num_iters):
        with mma_ctx.output_pipeline.per_k().produce() as stage:
            mma(stage.tmem, ...)
        # exit signals mma_arrive for this K iteration

Example (Epilogue warp):

    for k_iter in range(num_iters):
        with epi_ctx.output_pipeline.per_k().consume() as stage:
            promote(stage.tmem, ...)
        # exit signals consumer_step for this K iteration
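To make the difference in signaling frequency concrete, here is a minimal sketch in plain Python. It is a model only, not Mojo and not this library's API: `SignalCounter`, `stage`, `run_per_tile`, and `run_per_k` are hypothetical names, and a counter increment stands in for the real `mma_arrive` signal.

```python
from contextlib import contextmanager


class SignalCounter:
    """Hypothetical stand-in for a barrier: counts how often it is signaled."""

    def __init__(self):
        self.count = 0

    def signal(self):
        self.count += 1


@contextmanager
def stage(counter):
    """Model of a stage context manager: signal once when the block exits."""
    yield
    counter.signal()


def run_per_tile(num_tiles, num_iters):
    """One signal per tile, after all K iterations of that tile have run."""
    counter = SignalCounter()
    for _ in range(num_tiles):
        with stage(counter):
            for _ in range(num_iters):
                pass  # placeholder for the MMA work of one K iteration
    return counter.count


def run_per_k(num_tiles, num_iters):
    """One signal per K iteration, as described for the per-K view."""
    counter = SignalCounter()
    for _ in range(num_tiles):
        for _ in range(num_iters):
            with stage(counter):
                pass  # placeholder for the MMA work of one K iteration
    return counter.count


per_tile_signals = run_per_tile(num_tiles=2, num_iters=4)
per_k_signals = run_per_k(num_tiles=2, num_iters=4)
print(per_tile_signals, per_k_signals)
```

In this model the per-K variant signals `num_tiles * num_iters` times rather than `num_tiles` times, which is what lets a consumer act on each partial result instead of waiting for the whole tile.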

Fields

  • pipeline_ptr (Pointer[OutputKPipeline[origin, num_stages, stage_stride_cols, cta_group].TilePipelineType, origin]): Pointer to the underlying OutputTilePipeline that this view wraps.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable

comptime members

__copyinit__is_trivial

comptime __copyinit__is_trivial = True

__del__is_trivial

comptime __del__is_trivial = True

__moveinit__is_trivial

comptime __moveinit__is_trivial = True

TilePipelineType

comptime TilePipelineType = OutputTilePipeline[num_stages, stage_stride_cols, cta_group]

Methods

__init__

__init__(pipeline_ptr: Pointer[OutputKPipeline[origin, num_stages, stage_stride_cols, cta_group].TilePipelineType, origin]) -> Self

produce

produce(self) -> MmaKStage[origin, num_stages, stage_stride_cols, cta_group]

Get MMA stage context manager for one K iteration.

Returns:

MmaKStage: Context manager that acquires a stage on enter and signals mma_arrive on exit.

consume

consume(self) -> PerKConsumerStage[origin, num_stages, stage_stride_cols, cta_group]

Get consumer context manager for one K iteration.

Returns:

PerKConsumerStage: Context manager that waits for the MMA on enter and signals consumer_step on exit.
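The enter/exit handshake described for produce() and consume() can be sketched with two semaphores. This is a plain-Python analogy under stated assumptions, not the Mojo implementation: `PerKPipelineModel`, its stage classes, and the `write`/`read` helpers are hypothetical, threads stand in for warps, and a Python list stands in for tensor memory stages.

```python
import threading


class PerKPipelineModel:
    """Hypothetical semaphore-based model of a staged per-K pipeline."""

    def __init__(self, num_stages):
        self.num_stages = num_stages
        self.slots = [None] * num_stages
        self.free = threading.Semaphore(num_stages)  # stages the producer may fill
        self.ready = threading.Semaphore(0)          # stages the consumer may read

    def produce(self, k):
        return _ProduceStage(self, k % self.num_stages)

    def consume(self, k):
        return _ConsumeStage(self, k % self.num_stages)


class _ProduceStage:
    def __init__(self, pipe, index):
        self.pipe, self.index = pipe, index

    def __enter__(self):
        self.pipe.free.acquire()   # acquire a stage on enter
        return self

    def __exit__(self, *exc):
        self.pipe.ready.release()  # models mma_arrive on exit
        return False

    def write(self, value):
        self.pipe.slots[self.index] = value


class _ConsumeStage:
    def __init__(self, pipe, index):
        self.pipe, self.index = pipe, index

    def __enter__(self):
        self.pipe.ready.acquire()  # wait for the producer on enter
        return self

    def __exit__(self, *exc):
        self.pipe.free.release()   # models consumer_step on exit
        return False

    def read(self):
        return self.pipe.slots[self.index]


def run(num_iters=6, num_stages=2):
    pipe = PerKPipelineModel(num_stages)
    seen = []

    def mma_warp():
        for k in range(num_iters):
            with pipe.produce(k) as stage:
                stage.write(k * 10)  # placeholder for one K iteration of MMA

    def epilogue_warp():
        for k in range(num_iters):
            with pipe.consume(k) as stage:
                seen.append(stage.read())  # placeholder for promote(...)

    threads = [threading.Thread(target=mma_warp),
               threading.Thread(target=epilogue_warp)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


result = run()
print(result)
```

Because the producer can only reuse a stage after the consumer has released it, the consumer in this model sees every K iteration's value exactly once and in order, even with fewer stages than iterations.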
