
Mojo struct

EpilogueKStage

struct EpilogueKStage[opc: OutputPipelineConfig, num_input_stages: Int]

Per-K stage for the epilogue warp in blockwise FP8.

Returned from EpilogueKContext.__enter__(). Bundles:

  • output_stage: TMEM access (offset for reading MMA results)
  • input_stage_index: Current A-scales stage
  • input_pipeline: For signaling A-scales consumption

Fields

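  • output_stage (EpilogueKStage[opc, num_input_stages].OutputStageType): TMEM access (offset for reading MMA results).
  • input_stage_index (UInt32): Index of the current A-scales stage.
  • input_pipeline (EpilogueKStage[opc, num_input_stages].InputPipelineType): Pipeline used to signal A-scales consumption.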

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, TrivialRegisterPassable

comptime members

InputPipelineType

comptime InputPipelineType = ProducerConsumerPipeline[num_input_stages]

OutputStageType

comptime OutputStageType = OutputStage[opc]

Methods

__init__

__init__(output_stage: OutputStage[opc], input_stage_index: UInt32, input_pipeline: ProducerConsumerPipeline[num_input_stages]) -> Self

arrive_input

arrive_input(self)

Arrive on the input pipeline's consumer barrier.

Use with lane-guarded patterns:

    if lane_id() < cluster_size:
        epi_stage.arrive_input()
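
The lane guard can be illustrated with a small host-side model. This is a Python sketch, not Mojo kernel code. ToyConsumerBarrier, ToyEpilogueKStage, and simulate_warp are hypothetical names invented for the illustration. It also assumes the barrier expects one arrival per guarded lane, which this page does not state. The sketch shows only that, when every lane of a warp runs the same code, the guard limits the number of arrive_input() calls to cluster_size.

```python
from dataclasses import dataclass


@dataclass
class ToyConsumerBarrier:
    """Hypothetical stand-in for a consumer barrier; it only counts arrivals."""
    expected: int  # assumption: one expected arrival per guarded lane
    arrived: int = 0

    def arrive(self) -> None:
        self.arrived += 1

    @property
    def complete(self) -> bool:
        return self.arrived == self.expected


@dataclass
class ToyEpilogueKStage:
    """Toy analogue of the bundle: a stage index plus the barrier it signals."""
    input_stage_index: int
    barrier: ToyConsumerBarrier

    def arrive_input(self) -> None:
        self.barrier.arrive()


def simulate_warp(cluster_size: int, warp_size: int = 32) -> ToyConsumerBarrier:
    """Run the guarded pattern once for every lane of a simulated warp."""
    barrier = ToyConsumerBarrier(expected=cluster_size)
    epi_stage = ToyEpilogueKStage(input_stage_index=0, barrier=barrier)
    for lane_id in range(warp_size):  # every lane executes the same code
        if lane_id < cluster_size:    # the lane guard from the docstring
            epi_stage.arrive_input()
    return barrier


guarded = simulate_warp(cluster_size=2)
print(guarded.arrived, guarded.complete)
```

In this model an unguarded call would arrive once per lane (32 times for a full warp), while the guarded form arrives exactly cluster_size times.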
