Skip to main content

Mojo struct

MMASmemDescriptor

@register_passable(trivial) struct MMASmemDescriptor

Descriptor for shared memory operands tcgen05 mma instructions.

This struct represents a descriptor that encodes information about shared memory layout and access patterns for warp group matrix multiply operations. The descriptor contains the following bit fields:

Bit fieldSizeDescription
0-1314Base address in shared memory
16-2914LBO: leading dim byte offset
32-4514SBO: stride dim byte offset
46-483Fixed constant value: 0b001
49-513Matrix base offset, 0 for canonical layouts
521Leading dimension stride mode:
  0: byte offset relative
  1: byte address absolute
(only used for 48B K tile)
53-608Fixed constant value: 0
61-633Swizzle mode:
  0: No swizzling
  1: 128-Byte with 32B atomic swizzling
  2: 128-Byte swizzling
  4: 64-Byte swizzling
  6: 32-Byte swizzling

Note:

  • Some bits are unused.
  • Base address, LBO, and SBO ignore 4 least significant bits.

See https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tcgen05-shared-memory-desc-layout

Fields

  • desc (UInt64): The 64-bit descriptor encodes shared memory operand information.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, MMAOperandDescriptor, Movable, UnknownDestructibility

Aliases

__copyinit__is_trivial

alias __copyinit__is_trivial = True

__del__is_trivial

alias __del__is_trivial = True

__moveinit__is_trivial

alias __moveinit__is_trivial = True

Methods

__init__

__init__(val: UInt64) -> Self

Initialize descriptor with raw 64-bit value.

This constructor allows creating a descriptor directly from a 64-bit integer that already contains the properly formatted bit fields for the descriptor.

The implicit attribute enables automatic conversion from UInt64 to MMASmemDescriptor.

Args:

  • val (UInt64): A 64-bit integer containing the complete descriptor bit layout.

__add__

__add__(self, offset: Int) -> Self

Add offset to descriptor's base address.

Args:

  • offset (Int): Byte offset to add to base address.

Returns:

Self: New descriptor with updated base address.

__iadd__

__iadd__(mut self, offset: Int)

Add offset to descriptor's base address in-place.

Args:

  • offset (Int): Byte offset to add to base address.

create

static create[stride_byte_offset: Int, leading_byte_offset: Int, swizzle_mode: TensorMapSwizzle = 0](smem_ptr: UnsafePointer[type, address_space=AddressSpace(3), mut=mut, origin=origin]) -> Self

Create a descriptor for shared memory operand.

Parameters:

  • stride_byte_offset (Int): Stride dimension offset in bytes.
  • leading_byte_offset (Int): Leading dimension stride in bytes.
  • swizzle_mode (TensorMapSwizzle): Memory access pattern mode.

Args:

Returns:

Self: Initialized descriptor for the shared memory operand.

Was this page helpful?