Mojo struct

BlockwiseFP8Smem

struct BlockwiseFP8Smem[a_type: DType, b_type: DType, c_type: DType, a_scales_type: DType, transpose_b: Bool, *, config: MatmulConfig[a_type, b_type, c_type, transpose_b]]

Shared-memory (SMEM) struct for blockwise FP8 matmul with a CLC scheduler pipeline.

A thin wrapper that pairs a BlockwiseFP8TileCore (the core field) with a SmemPipelineBundle (the pipelines field).

Fields

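  • core (BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core): The tile core, an instance of the Core comptime member (BlockwiseFP8TileCore).
  • pipelines (BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Pipelines): The pipeline bundle, an instance of the Pipelines comptime member (SmemPipelineBundle).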

Implemented traits

AnyType, ImplicitlyDestructible

comptime members

Core

comptime Core = BlockwiseFP8TileCore[a_type, b_type, c_type, a_scales_type, transpose_b, config=config]

Pipelines

comptime Pipelines = SmemPipelineBundle[
    BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.num_group_pipeline_stages,
    BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.num_accum_pipeline_stages,
    config.num_clc_pipeline_stages,
    BlockwiseFP8TilePayload[
        a_type,
        b_type,
        a_scales_type,
        IndexList(BlockwiseFP8TileCore[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].BM, BlockwiseFP8TileCore[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].BK, __list_literal__=NoneType(None)),
        IndexList(BlockwiseFP8TileCore[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].BN, BlockwiseFP8TileCore[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].BK, __list_literal__=NoneType(None)),
        IndexList(1, BlockwiseFP8TileCore[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].BM, __list_literal__=NoneType(None)),
        BlockwiseFP8TileCore[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].num_pipeline_stages
    ]
]
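
The expanded type is dense, so the three tile shapes it passes to BlockwiseFP8TilePayload may be easier to see in a short sketch. The example is written in Python rather than Mojo so that it runs without the kernel library. The shapes (BM, BK), (BN, BK), and (1, BM) are read off the IndexList arguments; the names PayloadShapes and payload_shapes and the example block sizes are hypothetical.

```python
from typing import NamedTuple, Tuple


class PayloadShapes(NamedTuple):
    """Per-stage tile shapes, as read from the IndexList arguments."""
    a_tile: Tuple[int, int]         # (BM, BK)
    b_tile: Tuple[int, int]         # (BN, BK)
    a_scales_tile: Tuple[int, int]  # (1, BM)


def payload_shapes(bm: int, bn: int, bk: int) -> PayloadShapes:
    # Hypothetical helper: mirrors the three IndexList(...) arguments only.
    return PayloadShapes((bm, bk), (bn, bk), (1, bm))


# Example block sizes are illustrative, not taken from any MatmulConfig.
shapes = payload_shapes(bm=128, bn=64, bk=128)
print(shapes)
```

Note that the A-scales tile has one row of BM entries, so it is much smaller than the A and B tiles of the same stage.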

Methods

a_tiles

a_tiles(ref[AddressSpace._value] self) -> BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.ATileArray

Get A tile array accessor.

Returns:

BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.ATileArray

b_tiles

b_tiles(ref[AddressSpace._value] self) -> BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.BTileArray

Get B tile array accessor.

Returns:

BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.BTileArray

c_tiles

c_tiles(ref[AddressSpace._value] self) -> BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.CTileArray

Get C tile array accessor.

Returns:

BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.CTileArray

a_scales_tiles

a_scales_tiles(ref[AddressSpace._value] self) -> BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.AScalesTileArray

Get A-scales tile array accessor.

Returns:

BlockwiseFP8Smem[a_type, b_type, c_type, a_scales_type, transpose_b, config=config].Core.AScalesTileArray

ab_pipeline_size

static ab_pipeline_size() -> Int

Total size of A+B tiles for all pipeline stages (in elements).

Returns:

Int

a_scales_pipeline_size

static a_scales_pipeline_size() -> Int

Total size of A-scales tiles for all pipeline stages (in elements).

Returns:

Int
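
Both ab_pipeline_size and a_scales_pipeline_size count elements, not bytes. One plausible reading is sketched in Python below, assuming each pipeline stage holds one A tile of BM x BK, one B tile of BN x BK, and one A-scales tile of 1 x BM (the shapes that appear in the Pipelines definition). The function names mirror the methods, but the formulas are an assumption and not the library's implementation; the real methods take no arguments and read these values from the config.

```python
def ab_pipeline_size(bm: int, bn: int, bk: int, num_pipeline_stages: int) -> int:
    # Assumed: one (BM x BK) A tile plus one (BN x BK) B tile per stage.
    return (bm * bk + bn * bk) * num_pipeline_stages


def a_scales_pipeline_size(bm: int, num_pipeline_stages: int) -> int:
    # Assumed: one (1 x BM) A-scales tile per stage.
    return 1 * bm * num_pipeline_stages


# Illustrative values only; not taken from any real MatmulConfig.
ab = ab_pipeline_size(bm=128, bn=64, bk=128, num_pipeline_stages=4)
scales = a_scales_pipeline_size(bm=128, num_pipeline_stages=4)
print(ab, scales)
```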

c_output_size

static c_output_size() -> Int

Size of C tiles for all output stages (in elements).

Returns:

Int

total_tile_size

static total_tile_size() -> Int

Total tile storage size (A+B+A-scales+C) in elements.

Returns:

Int
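
Because the total is reported in elements and the regions can use different element types, a single multiplier does not convert it to bytes. The Python sketch below assumes the total is a plain sum of the three component sizes, as the description "A+B+A-scales+C" suggests, and that A and B share one element width. The helper tile_bytes, the byte widths, and the element counts are hypothetical and chosen only for illustration.

```python
def total_tile_size(ab_elems: int, a_scales_elems: int, c_elems: int) -> int:
    # Assumed from the description "A+B+A-scales+C": a sum of element counts.
    return ab_elems + a_scales_elems + c_elems


def tile_bytes(ab_elems: int, a_scales_elems: int, c_elems: int,
               ab_width: int, a_scales_width: int, c_width: int) -> int:
    # Each region is scaled by its own element width in bytes, because the
    # regions may hold different element types.
    return (ab_elems * ab_width
            + a_scales_elems * a_scales_width
            + c_elems * c_width)


# Illustrative numbers: 1-byte FP8 inputs, 4-byte scales, 2-byte output.
elems = total_tile_size(98304, 512, 16384)
nbytes = tile_bytes(98304, 512, 16384, ab_width=1, a_scales_width=4, c_width=2)
print(elems, nbytes)
```

A byte figure computed this way covers tile storage only; any space used by the pipelines field is not included in total_tile_size.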