Mojo struct
Conv2dSmem
struct Conv2dSmem[act_type: DType, filter_type: DType, out_type: DType, *, config: Conv2dConfig[act_type, filter_type, out_type]]
Shared memory layout for SM100 Conv2D fprop kernel.
This struct manages shared memory allocation for:
- Activation tiles (after im2col transformation)
- Filter tiles
- Output tiles for accumulation
- Synchronization barriers
The layout mirrors B200MatmulSmem but with conv-specific semantics:
- A tiles = im2col'd activation (M x K where M = NHW, K = CRS)
- B tiles = filter (transposed, K x N where K = CRS, N = K_out)
- C tiles = output (M x N)
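The GEMM mapping above can be made concrete with a little shape arithmetic. The Python sketch below is illustrative only and is not part of the Mojo API: `implicit_gemm_shape` and `tile_counts` are hypothetical helper names, and the sketch assumes that H and W in M = NHW denote the output spatial extent, computed with the usual stride/padding/dilation formula.

```python
from math import ceil


def implicit_gemm_shape(n, h, w, c, k_out, r, s, stride=1, pad=0, dilation=1):
    """Return (M, N, K) of the GEMM that a Conv2D fprop maps onto.

    Hypothetical helper for illustration. Assumes M counts output pixels.
    """
    # Standard convolution output-size formula.
    h_out = (h + 2 * pad - dilation * (r - 1) - 1) // stride + 1
    w_out = (w + 2 * pad - dilation * (s - 1) - 1) // stride + 1
    m = n * h_out * w_out  # rows of A (im2col'd activation) and of C
    k = c * r * s          # reduction dimension shared by A and B
    return m, k_out, k     # N of the GEMM is the filter count K_out


def tile_counts(m, n, k, bm, bn, bk):
    """Number of BM x BN output tiles and of BK-deep steps along K."""
    return ceil(m / bm), ceil(n / bn), ceil(k / bk)


# Example: batch 8, 56x56 input, 64 channels, 128 filters of size 3x3, pad 1.
m, n, k = implicit_gemm_shape(n=8, h=56, w=56, c=64, k_out=128, r=3, s=3, pad=1)
print(m, n, k)                                       # 25088 128 576
print(tile_counts(m, n, k, bm=128, bn=128, bk=64))   # (196, 1, 9)
```

The block tile sizes used here (128, 128, 64) are example values, not defaults taken from Conv2dConfig.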
Parameters
- act_type (DType): Activation data type.
- filter_type (DType): Filter data type.
- out_type (DType): Output data type.
- config (Conv2dConfig): Kernel configuration.
Fields
- input_tiles (Conv2dSmem[act_type, filter_type, out_type, config=config].InputTiles)
- output_tiles (Conv2dSmem[act_type, filter_type, out_type, config=config].OutputTiles)
- source_tiles (Conv2dSmem[act_type, filter_type, out_type, config=config].SourceTiles)
- pipelines (Conv2dSmem[act_type, filter_type, out_type, config=config].Pipelines)
- epi_load_pipeline (Conv2dSmem[act_type, filter_type, out_type, config=config].EpiLoadPipeline)
- load_order_barrier (Conv2dSmem[act_type, filter_type, out_type, config=config].LoadOrderBarrier)
Implemented traits
AnyType,
ImplicitlyDestructible
comptime members
__del__is_trivial
comptime __del__is_trivial = True
act_smem_layout
comptime act_smem_layout = tile_layout_k_major[act_type, Conv2dSmem[act_type, filter_type, out_type, config=config].BM, Conv2dSmem[act_type, filter_type, out_type, config=config].BK, config.a_swizzle]()
ActTileArray
comptime ActTileArray = Conv2dSmem[act_type, filter_type, out_type, config=config].InputTiles.ATileArray
BK
comptime BK = config.block_tile_shape.__getitem__[Int](2)
BM
comptime BM = config.block_tile_shape.__getitem__[Int](0)
BN
comptime BN = config.block_tile_shape.__getitem__[Int](1)
EpiLoadBarriers
comptime EpiLoadBarriers = Conv2dSmem[act_type, filter_type, out_type, config=config].EpiLoadPipeline.BarrierArray
EpiLoadPipeline
comptime EpiLoadPipeline = EpiLoadPipelineStorage[Conv2dSmem[act_type, filter_type, out_type, config=config].num_epi_load_stages]
filter_smem_layout
comptime filter_smem_layout = tile_layout_k_major[filter_type, Conv2dSmem[act_type, filter_type, out_type, config=config].BN, Conv2dSmem[act_type, filter_type, out_type, config=config].BK, config.b_swizzle]()
FilterTileArray
comptime FilterTileArray = Conv2dSmem[act_type, filter_type, out_type, config=config].InputTiles.BTileArray
InputTiles
comptime InputTiles = StandardTileStorage[act_type, filter_type, Conv2dSmem[act_type, filter_type, out_type, config=config].BM, Conv2dSmem[act_type, filter_type, out_type, config=config].BK, Conv2dSmem[act_type, filter_type, out_type, config=config].BN, Conv2dSmem[act_type, filter_type, out_type, config=config].BK, Conv2dSmem[act_type, filter_type, out_type, config=config].num_pipeline_stages]
LoadOrderBarrier
comptime LoadOrderBarrier = LoadOrderBarrierStorage
LoadOrderBarriers
comptime LoadOrderBarriers = Conv2dSmem[act_type, filter_type, out_type, config=config].LoadOrderBarrier.BarrierArray
num_accum_pipeline_stages
comptime num_accum_pipeline_stages = config.num_accum_pipeline_stages
num_clc_pipeline_stages
comptime num_clc_pipeline_stages = config.num_clc_pipeline_stages
num_epi_load_stages
comptime num_epi_load_stages = 2
num_group_pipeline_stages
comptime num_group_pipeline_stages = (Conv2dSmem[act_type, filter_type, out_type, config=config].num_pipeline_stages // config)
num_output_stages
comptime num_output_stages = config.num_output_stages
num_pipeline_stages
comptime num_pipeline_stages = config.num_pipeline_stages
out_smem_layout
comptime out_smem_layout = Layout.row_major(Conv2dSmem[act_type, filter_type, out_type, config=config].OutputM, Conv2dSmem[act_type, filter_type, out_type, config=config].OutputN)
OutputM
comptime OutputM = config.output_tile_shape.__getitem__[Int](0)
OutputN
comptime OutputN = config.output_tile_shape.__getitem__[Int](1)
OutputTiles
comptime OutputTiles = OutputTileStorage[out_type, Conv2dSmem[act_type, filter_type, out_type, config=config].OutputM, Conv2dSmem[act_type, filter_type, out_type, config=config].OutputN, Conv2dSmem[act_type, filter_type, out_type, config=config].num_output_stages]
OutTileArray
comptime OutTileArray = Conv2dSmem[act_type, filter_type, out_type, config=config].OutputTiles.CTileArray
Pipelines
comptime Pipelines = SmemPipelineBundle[Conv2dSmem[act_type, filter_type, out_type, config=config].num_group_pipeline_stages, Conv2dSmem[act_type, filter_type, out_type, config=config].num_accum_pipeline_stages, Conv2dSmem[act_type, filter_type, out_type, config=config].num_clc_pipeline_stages, StandardTilePayload[act_type, filter_type, Conv2dSmem[act_type, filter_type, out_type, config=config].BM, Conv2dSmem[act_type, filter_type, out_type, config=config].BK, Conv2dSmem[act_type, filter_type, out_type, config=config].BN, Conv2dSmem[act_type, filter_type, out_type, config=config].BK, Conv2dSmem[act_type, filter_type, out_type, config=config].num_pipeline_stages]]
SourceTiles
comptime SourceTiles = SourceTileStorage[out_type, Conv2dSmem[act_type, filter_type, out_type, config=config].OutputM, Conv2dSmem[act_type, filter_type, out_type, config=config].OutputN, Conv2dSmem[act_type, filter_type, out_type, config=config].num_epi_load_stages]
SrcTileArray
comptime SrcTileArray = Conv2dSmem[act_type, filter_type, out_type, config=config].SourceTiles.SrcTileArray
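The tile-storage members above (InputTiles, OutputTiles, SourceTiles) suggest a simple way to estimate the shared memory taken by tile data. The Python sketch below is a rough estimate under stated assumptions, not the library's accounting: it assumes each storage holds `stages x rows x cols` densely packed elements with no padding, and it ignores the space used by pipeline barriers. `smem_tile_bytes` and `DTYPE_BYTES` are hypothetical names introduced for illustration.

```python
# Element sizes in bytes for a few common dtypes (illustrative subset).
DTYPE_BYTES = {"float8": 1, "bfloat16": 2, "float16": 2, "float32": 4}


def smem_tile_bytes(act_type, filter_type, out_type,
                    bm, bn, bk, output_m, output_n,
                    num_pipeline_stages, num_output_stages,
                    num_epi_load_stages=2):
    """Estimate bytes of shared memory used by tile data only.

    Assumes dense tiles and excludes barrier storage.
    """
    out_tile = output_m * output_n * DTYPE_BYTES[out_type]
    sizes = {
        # A tiles: BM x BK per stage (im2col'd activation).
        "act": bm * bk * DTYPE_BYTES[act_type] * num_pipeline_stages,
        # B tiles: BN x BK per stage (filter).
        "filter": bn * bk * DTYPE_BYTES[filter_type] * num_pipeline_stages,
        # C tiles: OutputM x OutputN per output stage.
        "out": out_tile * num_output_stages,
        # Source C tiles: OutputM x OutputN per epilogue-load stage.
        "src": out_tile * num_epi_load_stages,
    }
    sizes["total"] = sum(sizes.values())
    return sizes


est = smem_tile_bytes("bfloat16", "bfloat16", "bfloat16",
                      bm=128, bn=128, bk=64, output_m=128, output_n=32,
                      num_pipeline_stages=4, num_output_stages=2)
print(est["total"])  # 163840 bytes (160 KiB) for these example values
```

The tile shapes and stage counts in the example are made-up inputs. Only `num_epi_load_stages = 2` comes from the struct definition.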
Methods
act_tiles
act_tiles(ref[AddressSpace._value._mlir_value] self) -> Conv2dSmem[act_type, filter_type, out_type, config=config].ActTileArray
Get activation tiles (im2col'd).
Returns:
Conv2dSmem[act_type, filter_type, out_type, config=config].ActTileArray
filter_tiles
filter_tiles(ref[AddressSpace._value._mlir_value] self) -> Conv2dSmem[act_type, filter_type, out_type, config=config].FilterTileArray
Get filter tiles.
Returns:
Conv2dSmem[act_type, filter_type, out_type, config=config].FilterTileArray
out_tiles
out_tiles(ref[AddressSpace._value._mlir_value] self) -> Conv2dSmem[act_type, filter_type, out_type, config=config].OutTileArray
Get output tiles.
Returns:
Conv2dSmem[act_type, filter_type, out_type, config=config].OutTileArray
src_tiles
src_tiles(ref[AddressSpace._value._mlir_value] self) -> Conv2dSmem[act_type, filter_type, out_type, config=config].SrcTileArray
Get source C tiles (for residual operations).
Returns:
Conv2dSmem[act_type, filter_type, out_type, config=config].SrcTileArray
epi_load_barriers
epi_load_barriers(ref[AddressSpace._value._mlir_value] self) -> Conv2dSmem[act_type, filter_type, out_type, config=config].EpiLoadBarriers
Get epilogue load pipeline barriers.
Used for synchronization between EpilogueLoad warp (producer) and Epilogue warps (consumers) for source C tensor loading.
Returns:
Conv2dSmem[act_type, filter_type, out_type, config=config].EpiLoadBarriers
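The producer/consumer relationship described for the epilogue load barriers can be pictured with a CPU analogy. The Python sketch below is not how the GPU barriers are implemented. It models a two-stage ring of slots with a "full" and an "empty" semaphore per stage, so that one producer thread (standing in for the EpilogueLoad warp) never overwrites a slot that the consumer thread (standing in for the Epilogue warps) has not yet read. `run_pipeline` is a hypothetical name.

```python
import threading

NUM_EPI_LOAD_STAGES = 2  # matches num_epi_load_stages in this struct


def run_pipeline(num_tiles, num_stages=NUM_EPI_LOAD_STAGES):
    """Pass num_tiles items through a num_stages-deep ring buffer."""
    slots = [None] * num_stages
    # full[s] is signaled when slot s holds data; empty[s] when it is free.
    full = [threading.Semaphore(0) for _ in range(num_stages)]
    empty = [threading.Semaphore(1) for _ in range(num_stages)]
    consumed = []

    def producer():
        for t in range(num_tiles):
            s = t % num_stages
            empty[s].acquire()          # wait until the slot is free
            slots[s] = f"C tile {t}"    # "load" the source tile
            full[s].release()           # publish it to the consumer

    def consumer():
        for t in range(num_tiles):
            s = t % num_stages
            full[s].acquire()           # wait until the slot is filled
            consumed.append(slots[s])   # "use" the tile in the epilogue
            empty[s].release()          # hand the slot back to the producer

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return consumed


print(run_pipeline(5))
```

With two stages the producer can run at most two tiles ahead of the consumer, which is the usual trade-off between overlap and shared memory footprint in multi-stage pipelines.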
get_load_order_barrier
get_load_order_barrier(ref[AddressSpace._value._mlir_value] self) -> Conv2dSmem[act_type, filter_type, out_type, config=config].LoadOrderBarriers
Get load order barrier.
Used to coordinate MainLoad warp with EpilogueLoad warp, ensuring epilogue loads don't start before mainloop prologue completes.
Returns:
Conv2dSmem[act_type, filter_type, out_type, config=config].LoadOrderBarriers
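The ordering guarantee described for the load order barrier (epilogue loads must not start before the mainloop prologue completes) can likewise be sketched on the CPU. This is an analogy with Python threads, not the kernel's mechanism: a `threading.Event` stands in for the barrier, and `run_load_order` and `prologue_stages` are hypothetical names.

```python
import threading


def run_load_order(prologue_stages=3):
    """Return the order of events when epilogue loads wait on the prologue."""
    events = []
    prologue_done = threading.Event()  # stands in for the load order barrier

    def main_load():
        # The mainloop prologue issues its first loads.
        for i in range(prologue_stages):
            events.append(f"main load stage {i}")
        prologue_done.set()            # signal: prologue complete

    def epilogue_load():
        prologue_done.wait()           # block until the prologue is done
        events.append("epilogue load")

    t_epi = threading.Thread(target=epilogue_load)
    t_main = threading.Thread(target=main_load)
    t_epi.start()   # started first on purpose; it still has to wait
    t_main.start()
    t_epi.join()
    t_main.join()
    return events


print(run_load_order())
```

Even though the epilogue-load thread is started first, its entry always appears last in the returned list, because it cannot proceed until the event is set.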