Mojo module

epilogue_components

Low-level epilogue components for SM100 matrix multiplication.

This module provides modular building blocks for the output pipeline:

  1. store_fragment_to_smem: Store register fragments to shared memory via stmatrix instructions
  2. TMEMToSMemWriter: Write TMEM accumulators to shared memory
  3. TMAStoreExecutor: Execute TMA stores with proper SMEM tiling
  4. EpilogueApplier: Apply element-wise operations on fragments

The SM100 epilogue pipeline flows as: TMEM (accumulators) → Registers → SMEM → GMEM (via TMA)
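
To make the ordering of the stages concrete, here is a minimal conceptual sketch in plain Python. It is not the module's API: nested lists stand in for TMEM, registers, SMEM, and GMEM, and every function name (`load_fragment`, `apply_epilogue`, `store_tile`, `run_epilogue`) is hypothetical and chosen only for illustration. The real components operate on GPU memory spaces and hardware instructions that this model does not attempt to reproduce.

```python
# Conceptual model only: Python lists stand in for the GPU memory spaces.
# All names here are hypothetical and do not correspond to the Mojo API.

def load_fragment(tmem, row0, rows):
    """TMEM -> registers: copy a band of accumulator rows."""
    return [list(r) for r in tmem[row0:row0 + rows]]

def apply_epilogue(fragment, op):
    """Apply an element-wise operation to the register fragment."""
    return [[op(x) for x in row] for row in fragment]

def store_tile(gmem, smem_tile, row0):
    """SMEM -> GMEM: write the staged tile to its destination rows."""
    for i, row in enumerate(smem_tile):
        gmem[row0 + i] = list(row)

def run_epilogue(tmem, op, tile_rows):
    """Walk the accumulators tile by tile through all pipeline stages."""
    gmem = [None] * len(tmem)
    for row0 in range(0, len(tmem), tile_rows):
        fragment = load_fragment(tmem, row0, tile_rows)  # TMEM -> registers
        fragment = apply_epilogue(fragment, op)          # element-wise op
        smem_tile = [list(r) for r in fragment]          # registers -> SMEM
        store_tile(gmem, smem_tile, row0)                # SMEM -> GMEM
    return gmem

# A 4x4 accumulator block with a bias-then-ReLU epilogue, two rows per tile.
accumulators = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
output = run_epilogue(accumulators, lambda x: max(x - 5.0, 0.0), tile_rows=2)
```

Staging each tile separately mirrors the role of shared memory as an intermediate buffer between registers and global memory; the tile size of two rows is an arbitrary choice for the example.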

Structs

Functions
