Mojo struct
BlockwiseFP8_1D2DSmem
struct BlockwiseFP8_1D2DSmem[a_type: DType, b_type: DType, c_type: DType, a_scales_type: DType, transpose_b: Bool, *, config: MatmulConfig[a_type, b_type, c_type, transpose_b]]
SMEM struct for blockwise FP8 1D2D matmul without CLC scheduler.
Thin wrapper over BlockwiseFP8TileCore + SmemPipelineBundleNoClc. Uses 3-warp specialization (Load, MMA, Epilogue) without a scheduler warp.
Fields
- core (BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core)
- pipelines (BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Pipelines)
Implemented traits
AnyType,
ImplicitlyDestructible
comptime members
Core
comptime Core = BlockwiseFP8TileCore[a_type, b_type, c_type, a_scales_type, transpose_b, config=config]
Pipelines
comptime Pipelines = SmemPipelineBundleNoClc[BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.num_group_pipeline_stages, BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.num_accum_pipeline_stages, BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.Payload]
Methods
a_tiles
a_tiles(ref[AddressSpace._value] self) -> BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.ATileArray
Get A tile array accessor.
Returns:
BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.ATileArray
b_tiles
b_tiles(ref[AddressSpace._value] self) -> BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.BTileArray
Get B tile array accessor.
Returns:
BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.BTileArray
c_tiles
c_tiles(ref[AddressSpace._value] self) -> BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.CTileArray
Get C tile array accessor.
Returns:
BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.CTileArray
a_scales_tiles
a_scales_tiles(ref[AddressSpace._value] self) -> BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.AScalesTileArray
Get A-scales tile array accessor.
Returns:
BlockwiseFP8_1D2DSmem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.AScalesTileArray
ab_pipeline_size
static ab_pipeline_size() -> Int
Total size of A+B tiles for all pipeline stages (in elements).
Returns:
Int
a_scales_pipeline_size
static a_scales_pipeline_size() -> Int
Total size of A-scales tiles for all pipeline stages (in elements).
Returns:
Int
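To make the "in elements" wording of the two pipeline-size methods concrete, here is a minimal sketch of how such totals could be computed. It is not the library's implementation. The helper names, the tile dimensions (bm, bn, bk), the stage count, and the assumption that A-scales hold one value per A-tile row per stage are all hypothetical and chosen only for illustration.

```mojo
# Hypothetical helpers: names and formulas are illustrative assumptions,
# not the definitions used by BlockwiseFP8_1D2DSmem.

fn ab_pipeline_elems(bm: Int, bn: Int, bk: Int, num_stages: Int) -> Int:
    # Assumes one A tile (bm x bk) plus one B tile (bn x bk) per stage.
    return (bm * bk + bn * bk) * num_stages


fn a_scales_pipeline_elems(bm: Int, num_stages: Int) -> Int:
    # Assumes one scale per A-tile row per stage (1D scaling along M).
    return bm * num_stages


fn main():
    # Example values only: 128x128x128 tiles with 4 pipeline stages.
    print(ab_pipeline_elems(128, 128, 128, 4))  # prints 131072
    print(a_scales_pipeline_elems(128, 4))  # prints 512
```

Both results are element counts, not bytes. Converting to bytes would require multiplying by the element size of the relevant DType (a_type and b_type for the tiles, a_scales_type for the scales).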
c_output_size
total_tile_size