Mojo struct
EpilogueKStage
struct EpilogueKStage[opc: OutputPipelineConfig, num_input_stages: Int]
Per-K stage for the epilogue warp in blockwise FP8.
Returned from EpilogueKContext.__enter__(). Bundles:
- output_stage: TMEM access (offset for reading MMA results)
- input_stage_index: Current A-scales stage
- input_pipeline: For signaling A-scales consumption
Fields
- output_stage (EpilogueKStage[opc, num_input_stages].OutputStageType)
- input_stage_index (UInt32)
- input_pipeline (EpilogueKStage[opc, num_input_stages].InputPipelineType)
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDestructible,
Movable,
RegisterPassable,
TrivialRegisterPassable
comptime members
InputPipelineType
comptime InputPipelineType = ProducerConsumerPipeline[num_input_stages]
OutputStageType
comptime OutputStageType = OutputStage[opc]
Methods
__init__
__init__(output_stage: OutputStage[opc], input_stage_index: UInt32, input_pipeline: ProducerConsumerPipeline[num_input_stages]) -> Self
arrive_input
arrive_input(self)
Arrive on the input pipeline's consumer barrier.
Use with lane-guarded patterns:
if lane_id() < cluster_size:
    epi_stage.arrive_input()
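The effect of the lane guard can be illustrated with a small CPU-side toy model. This is a sketch in Python, not Mojo GPU code, and it rests on an assumption the page does not state: that the consumer barrier expects exactly cluster_size arrivals per stage. ToyConsumerBarrier, run_epilogue_warp, and WARP_SIZE are hypothetical names invented for this illustration and are not part of the Mojo API.

```python
# Toy model only: none of these names exist in the Mojo API.
WARP_SIZE = 32  # assumed number of lanes in a warp


class ToyConsumerBarrier:
    """Counts arrivals; stands in for the input pipeline's consumer barrier."""

    def __init__(self, expected_arrivals: int) -> None:
        # Assumption: the barrier is configured for this many arrivals.
        self.expected_arrivals = expected_arrivals
        self.arrivals = 0

    def arrive(self) -> None:
        self.arrivals += 1

    def is_complete(self) -> bool:
        return self.arrivals == self.expected_arrivals


def run_epilogue_warp(cluster_size: int, guarded: bool) -> ToyConsumerBarrier:
    """Simulate every lane of one warp reaching the arrive call."""
    barrier = ToyConsumerBarrier(expected_arrivals=cluster_size)
    for lane_id in range(WARP_SIZE):
        # Mirrors: if lane_id() < cluster_size: epi_stage.arrive_input()
        if not guarded or lane_id < cluster_size:
            barrier.arrive()
    return barrier


guarded = run_epilogue_warp(cluster_size=2, guarded=True)
unguarded = run_epilogue_warp(cluster_size=2, guarded=False)
print(guarded.arrivals, guarded.is_complete())
print(unguarded.arrivals, unguarded.is_complete())
```

Under that assumption, the guarded warp contributes one arrival per lane below cluster_size, while an unguarded warp would arrive once per lane and overshoot the expected count.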