Mojo struct
TMAStoreCoords
struct TMAStoreCoords[epc: EpilogueConfig, c_smem_shape0: Int, stage: Int, batched: Bool = False]
TMA store coordinates and warp election for SM100 epilogue.
When batched=True, includes a batch coordinate for 3D TMA stores.
Fields
- coord_m (Int)
- coord_n (Int)
- coord_b (Int)
- elect_one_warp (Bool)
- c_smem_coord_m (Int)
Implemented traits
AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, TrivialRegisterPassable
comptime members
BM
comptime BM = epc.BM
BN
comptime BN = epc.BN
CG1_TMA_BM
comptime CG1_TMA_BM = c_smem_shape0
CG2_TMA_BM
comptime CG2_TMA_BM = c_smem_shape0 if (TMAStoreCoords[epc, c_smem_shape0, stage, batched].MMA_M == 256) else TMAStoreCoords[epc, c_smem_shape0, stage, batched].BM
cta_group
comptime cta_group = epc.cta_group
MMA_M
comptime MMA_M = epc.MMA_M
MMA_N
comptime MMA_N = epc.MMA_N
stage_n_offset
comptime stage_n_offset = (stage * TMAStoreCoords[epc, c_smem_shape0, stage, batched].stageN)
stageN
comptime stageN = epc.stageN
TMA_BM
comptime TMA_BM = c_smem_shape0 if (epc.MMA_M == 256) else TMAStoreCoords[epc, c_smem_shape0, stage, batched].BM if (TMAStoreCoords[epc, c_smem_shape0, stage, batched].cta_group == 2) else TMAStoreCoords[epc, c_smem_shape0, stage, batched].CG1_TMA_BM
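The selection logic in the comptime members above can be modeled in plain Python as a rough sketch. This is not Mojo and not the library's implementation: the function names and the sample tile sizes are hypothetical, chosen only to illustrate how TMA_BM, CG1_TMA_BM, CG2_TMA_BM, and stage_n_offset relate. The printed TMA_BM expression can be read either as "pick CG2_TMA_BM when cta_group == 2, otherwise CG1_TMA_BM" or as a right-nested conditional; because CG1_TMA_BM equals c_smem_shape0, both readings give the same value, which the sketch checks.

```python
# Plain-Python model of the comptime expressions listed above.
# All function names and sample values are illustrative assumptions.

def cg1_tma_bm(c_smem_shape0: int) -> int:
    # CG1_TMA_BM = c_smem_shape0
    return c_smem_shape0


def cg2_tma_bm(mma_m: int, bm: int, c_smem_shape0: int) -> int:
    # CG2_TMA_BM = c_smem_shape0 if MMA_M == 256 else BM
    return c_smem_shape0 if mma_m == 256 else bm


def tma_bm(mma_m: int, cta_group: int, bm: int, c_smem_shape0: int) -> int:
    # Reading 1: choose the CG2 value for cta_group == 2, else the CG1 value.
    if cta_group == 2:
        return cg2_tma_bm(mma_m, bm, c_smem_shape0)
    return cg1_tma_bm(c_smem_shape0)


def tma_bm_as_printed(mma_m: int, cta_group: int, bm: int, c_smem_shape0: int) -> int:
    # Reading 2: the expression as printed, parsed as a right-nested conditional.
    return c_smem_shape0 if mma_m == 256 else (
        bm if cta_group == 2 else cg1_tma_bm(c_smem_shape0)
    )


def stage_n_offset(stage: int, stage_n: int) -> int:
    # stage_n_offset = stage * stageN
    return stage * stage_n


if __name__ == "__main__":
    # Hypothetical sizes: BM = 128, c_smem_shape0 = 64, stageN = 32.
    for mma_m in (128, 256):
        for cta_group in (1, 2):
            a = tma_bm(mma_m, cta_group, 128, 64)
            b = tma_bm_as_printed(mma_m, cta_group, 128, 64)
            print(mma_m, cta_group, a, a == b)
    print(stage_n_offset(3, 32))
```

With these sample values, only the combination cta_group == 2 with MMA_M other than 256 selects BM; every other combination falls back to c_smem_shape0.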
Methods
__init__
__init__(c_coord: Tuple[UInt32, UInt32], warp_id: UInt32) -> Self
Compute TMA store coordinates from 2D tile coords and warp ID.
__init__(c_coord: Tuple[UInt32, UInt32, UInt32], warp_id: UInt32) -> Self
Compute TMA store coordinates from 3D tile coords and warp ID.