Mojo struct
OutputKPipeline
@register_passable(trivial)
struct OutputKPipeline[origin: MutOrigin, num_stages: Int, stage_stride_cols: Int, cta_group: Int]
Per-K-iteration view of OutputTilePipeline.
Unlike the standard producer()/consumer() views, which signal once per tile (after all K iterations), this view signals after each K iteration. Use it for kernels with per-K accumulation patterns (e.g., blockwise FP8).
Example (MMA warp):
    for i in range(num_iters):
        with mma_ctx.output_pipeline.per_k().produce() as stage:
            mma(stage.tmem, ...)
        # exit signals mma_arrive for this K iteration
Example (Epilogue warp):
    for k_iter in range(num_iters):
        with epi_ctx.output_pipeline.per_k().consume() as stage:
            promote(stage.tmem, ...)
        # exit signals consumer_step for this K iteration
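The difference between per-tile and per-K signaling described above can be illustrated with a small Python model. This is only a conceptual sketch, not Mojo and not the library's implementation: `SignalLog`, `per_tile_stage`, `per_k_stage`, and the event names are hypothetical stand-ins for the real barrier operations.

```python
from contextlib import contextmanager


class SignalLog:
    """Hypothetical stand-in for a pipeline barrier: records each signal."""

    def __init__(self):
        self.events = []

    def signal(self, name):
        self.events.append(name)


@contextmanager
def per_tile_stage(log):
    # Models the standard view: the K loop runs inside one context,
    # so exactly one signal is raised after all K iterations.
    yield "stage"
    log.signal("tile_done")


@contextmanager
def per_k_stage(log, k):
    # Models the per-K view: each K iteration has its own context,
    # so a signal is raised on every exit.
    yield "stage"
    log.signal(f"k_done:{k}")


def run_per_tile(num_iters):
    log = SignalLog()
    with per_tile_stage(log):
        for _ in range(num_iters):
            pass  # accumulate into the stage (omitted)
    return log.events


def run_per_k(num_iters):
    log = SignalLog()
    for k in range(num_iters):
        with per_k_stage(log, k):
            pass  # one K iteration of work (omitted)
    return log.events
```

In this model, the per-tile run records a single event regardless of `num_iters`, while the per-K run records one event per iteration, which is what lets a consumer act on each partial result.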
Fields
- pipeline_ptr (Pointer[OutputKPipeline[origin, num_stages, stage_stride_cols, cta_group].TilePipelineType, origin]):
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDestructible,
Movable
comptime members
__copyinit__is_trivial
comptime __copyinit__is_trivial = True
__del__is_trivial
comptime __del__is_trivial = True
__moveinit__is_trivial
comptime __moveinit__is_trivial = True
TilePipelineType
comptime TilePipelineType = OutputTilePipeline[num_stages, stage_stride_cols, cta_group]
Methods
__init__
__init__(pipeline_ptr: Pointer[OutputKPipeline[origin, num_stages, stage_stride_cols, cta_group].TilePipelineType, origin]) -> Self
produce
produce(self) -> MmaKStage[origin, num_stages, stage_stride_cols, cta_group]
Get MMA stage context manager for one K iteration.
Returns:
MmaKStage: Context manager that acquires a stage on enter and signals mma_arrive on exit.
consume
consume(self) -> PerKConsumerStage[origin, num_stages, stage_stride_cols, cta_group]
Get consumer context manager for one K iteration.
Returns:
PerKConsumerStage: Context manager that waits for MMA on enter and signals consumer_step on exit.
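The enter/exit contract of produce() and consume() can be sketched as a handshake between two threads. The following Python model is an assumption-laden illustration, not the Mojo implementation: `ToyKPipeline` is hypothetical, it uses a single stage rather than `num_stages`, and semaphores stand in for the hardware barriers behind mma_arrive and consumer_step.

```python
import threading
from contextlib import contextmanager


class ToyKPipeline:
    """Hypothetical single-stage model of the per-K handshake."""

    def __init__(self):
        self._free = threading.Semaphore(1)  # stage is available to the producer
        self._full = threading.Semaphore(0)  # stage holds an unread MMA result
        self.slot = None                     # stand-in for the stage's tmem

    @contextmanager
    def produce(self):
        self._free.acquire()      # enter: acquire the stage
        try:
            yield self
        finally:
            self._full.release()  # exit: models mma_arrive

    @contextmanager
    def consume(self):
        self._full.acquire()      # enter: wait for the MMA result
        try:
            yield self
        finally:
            self._free.release()  # exit: models consumer_step


def run(num_iters):
    pipe = ToyKPipeline()
    promoted = []

    def mma_warp():
        for k in range(num_iters):
            with pipe.produce() as stage:
                stage.slot = k * k  # stand-in for mma(stage.tmem, ...)

    def epilogue_warp():
        for _ in range(num_iters):
            with pipe.consume() as stage:
                promoted.append(stage.slot)  # stand-in for promote(...)

    threads = [
        threading.Thread(target=mma_warp),
        threading.Thread(target=epilogue_warp),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return promoted
```

Because the producer cannot re-acquire the stage until the consumer has released it, each K iteration's result is read exactly once and in order, mirroring the per-iteration signaling that the two context managers provide.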