Mojo struct

HopperMatmulSM90Kernel_SMem

@register_passable(trivial) struct HopperMatmulSM90Kernel_SMem[a_type: DType, a_layout: Layout, b_type: DType, b_layout: Layout, c_type: DType, c_layout: Layout, num_pipeline_stages: Int]

Shared memory layout for the Hopper SM90 matrix multiplication kernel.

This struct manages the shared memory allocation for:

  • Input tiles (A and B matrices) with multi-stage pipelining
  • Output tile (C matrix) for accumulation
  • Synchronization barriers for producer-consumer coordination

The memory is organized to support asynchronous loads and efficient bank-conflict-free access patterns for tensor core operations.
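The producer-consumer coordination described above can be modeled outside the GPU. The following is a minimal Python sketch, not the kernel's implementation: it assumes the common ring-buffer pattern in which each pipeline stage has one "full" signal and one "empty" signal, and it uses `threading.Semaphore` as a stand-in for a shared memory barrier. The function name `run_pipeline` and the use of an integer as a tile are hypothetical.

```python
import threading


def run_pipeline(num_k_tiles, num_pipeline_stages):
    """Model a producer filling stage slots and a consumer draining them."""
    # One slot per pipeline stage, standing in for an (A tile, B tile) pair.
    slots = [None] * num_pipeline_stages
    # full[s] is signaled once slot s holds data (assumed role of full_mbar).
    full = [threading.Semaphore(0) for _ in range(num_pipeline_stages)]
    # empty[s] is signaled once slot s may be overwritten (assumed role of empty_mbar).
    empty = [threading.Semaphore(1) for _ in range(num_pipeline_stages)]
    consumed = []

    def producer():
        for k in range(num_k_tiles):
            s = k % num_pipeline_stages  # stages are reused round-robin
            empty[s].acquire()           # wait until the consumer released slot s
            slots[s] = k                 # "load" tile k into the slot
            full[s].release()            # tell the consumer the slot is ready

    def consumer():
        for k in range(num_k_tiles):
            s = k % num_pipeline_stages
            full[s].acquire()            # wait until the producer filled slot s
            consumed.append(slots[s])    # "compute" on the tile
            empty[s].release()           # hand the slot back to the producer

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

With more than one stage, the producer can run ahead of the consumer by up to `num_pipeline_stages` tiles, which is the point of multi-stage pipelining: loads for later tiles overlap with computation on earlier ones.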

Fields

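  • a_tiles (SMemTileArrayType[a_type, a_layout, num_pipeline_stages, 128]): Input tiles of matrix A, one per pipeline stage.
  • b_tiles (SMemTileArrayType[b_type, b_layout, num_pipeline_stages, 128]): Input tiles of matrix B, one per pipeline stage.
  • c_tile (LayoutTensor[c_type, c_layout, MutableAnyOrigin, address_space=AddressSpace(3), alignment=128]): Output tile of matrix C.
  • full_mbar (SMemArrayType[SharedMemBarrier, num_pipeline_stages]): Synchronization barriers, one per pipeline stage.
  • empty_mbar (SMemArrayType[SharedMemBarrier, num_pipeline_stages]): Synchronization barriers, one per pipeline stage.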
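To make the idea of a shared memory layout concrete, here is a rough Python sketch of how five regions like the fields above could be carved out of one buffer. It is an illustration under stated assumptions, not the struct's actual allocation logic: it assumes regions are placed in declaration order, that the `128` in the tile types is a byte alignment, that each barrier occupies 8 bytes, and it uses hypothetical tile sizes (4 stages, bf16 tiles of 64x64 for A, 64x128 for B, and 64x128 for C). The helper names `align_up` and `carve` are made up for this example.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def carve(regions):
    """Assign byte offsets to regions given as (name, size_bytes, alignment).

    Returns ({name: offset}, total_bytes_used).
    """
    offsets = {}
    cursor = 0
    for name, size, alignment in regions:
        cursor = align_up(cursor, alignment)  # pad so the region starts aligned
        offsets[name] = cursor
        cursor += size
    return offsets, cursor


# Hypothetical configuration: 4 stages, 2-byte elements.
STAGES = 4
example_regions = [
    ("a_tiles", STAGES * 64 * 64 * 2, 128),
    ("b_tiles", STAGES * 64 * 128 * 2, 128),
    ("c_tile", 64 * 128 * 2, 128),
    ("full_mbar", STAGES * 8, 8),   # assumed 8 bytes per barrier
    ("empty_mbar", STAGES * 8, 8),
]
example_offsets, example_total = carve(example_regions)
```

In this configuration every tile size is already a multiple of 128 bytes, so no padding is inserted; a region whose size is not a multiple of the next region's alignment would push that next region forward.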

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, Movable, UnknownDestructibility

Aliases

__copyinit__is_trivial

alias __copyinit__is_trivial = True

__del__is_trivial

alias __del__is_trivial = True

__moveinit__is_trivial

alias __moveinit__is_trivial = True

ATileArray

alias ATileArray = SMemTileArrayType[a_type, a_layout, num_pipeline_stages, 128]

BTileArray

alias BTileArray = SMemTileArrayType[b_type, b_layout, num_pipeline_stages, 128]

CTile

alias CTile = LayoutTensor[c_type, c_layout, MutableAnyOrigin, address_space=AddressSpace(3), alignment=128]

PipelineBarrier

alias PipelineBarrier = SMemArrayType[SharedMemBarrier, num_pipeline_stages]

SMM

alias SMM = SharedMemoryManager[NVIDIASharedMemoryBasePtr]

Methods

__init__

__init__() -> Self

pipeline_storage_size

static pipeline_storage_size() -> Int

Calculate the memory size for all pipeline stages.

Returns:

Int

output_storage_size

static output_storage_size() -> Int

Calculate the memory size for the output tile.

Returns:

Int

storage_size

static storage_size() -> Int

Calculate the total storage size.

Returns:

Int
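The three size methods can be approximated with simple arithmetic. The Python sketch below is a model under assumptions, not the documented implementation: it assumes the pipeline storage is the A and B tiles for every stage plus the two barrier arrays, that the total is the pipeline storage plus the output tile, that each barrier takes 8 bytes, and it ignores alignment padding. Tile byte sizes are passed in directly, and `max_stages` is a hypothetical helper added to show why the total matters when choosing `num_pipeline_stages`.

```python
def pipeline_storage_size(a_tile_bytes, b_tile_bytes, num_pipeline_stages, barrier_bytes=8):
    """Bytes for all A/B stage tiles plus full and empty barriers (assumed formula)."""
    tiles = num_pipeline_stages * (a_tile_bytes + b_tile_bytes)
    barriers = 2 * num_pipeline_stages * barrier_bytes
    return tiles + barriers


def output_storage_size(c_tile_bytes):
    """Bytes for the single output tile."""
    return c_tile_bytes


def storage_size(a_tile_bytes, b_tile_bytes, c_tile_bytes, num_pipeline_stages):
    """Total bytes, assumed to be pipeline storage plus output storage."""
    return (
        pipeline_storage_size(a_tile_bytes, b_tile_bytes, num_pipeline_stages)
        + output_storage_size(c_tile_bytes)
    )


def max_stages(a_tile_bytes, b_tile_bytes, c_tile_bytes, budget_bytes):
    """Largest stage count whose total storage fits in budget_bytes."""
    stages = 0
    while storage_size(a_tile_bytes, b_tile_bytes, c_tile_bytes, stages + 1) <= budget_bytes:
        stages += 1
    return stages


# Hypothetical tiles: 16 KiB each for A and B, 32 KiB for C.
A_BYTES, B_BYTES, C_BYTES = 16384, 16384, 32768
# Assumed budget: 227 KiB of shared memory per thread block.
BUDGET = 227 * 1024
```

Under these assumptions, four stages need `storage_size(A_BYTES, B_BYTES, C_BYTES, 4)` bytes, and `max_stages(A_BYTES, B_BYTES, C_BYTES, BUDGET)` gives the deepest pipeline that still fits in the budget.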
