Mojo struct
MMASmemDescriptor
@register_passable(trivial)
struct MMASmemDescriptor
Descriptor for shared memory operands tcgen05 mma instructions.
This struct represents a descriptor that encodes information about shared memory layout and access patterns for warp group matrix multiply operations. The descriptor contains the following bit fields:
Bit field | Size | Description |
---|---|---|
0-13 | 14 | Base address in shared memory |
16-29 | 14 | LBO: leading dim byte offset |
32-45 | 14 | SBO: stride dim byte offset |
46-48 | 3 | Fixed constant value: 0b001 |
49-51 | 3 | Matrix base offset, 0 for canonical layouts |
52 | 1 | Leading dimension stride mode: 0: byte offset relative 1: byte address absolute (only used for 48B K tile) |
53-60 | 8 | Fixed constant value: 0 |
61-63 | 3 | Swizzle mode: 0: No swizzling 1: 128-Byte with 32B atomic swizzling 2: 128-Byte swizzling 4: 64-Byte swizzling 6: 32-Byte swizzling |
Note:
- Some bits are unused.
- Base address, LBO, and SBO ignore 4 least significant bits.
See https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tcgen05-shared-memory-desc-layout
Fields
- desc (
UInt64
): The 64-bit descriptor encodes shared memory operand information.
Implemented traits
AnyType
,
Copyable
,
ImplicitlyCopyable
,
MMAOperandDescriptor
,
Movable
,
UnknownDestructibility
Aliases
__copyinit__is_trivial
alias __copyinit__is_trivial = True
__del__is_trivial
alias __del__is_trivial = True
__moveinit__is_trivial
alias __moveinit__is_trivial = True
Methods
__init__
__init__(val: UInt64) -> Self
Initialize descriptor with raw 64-bit value.
This constructor allows creating a descriptor directly from a 64-bit integer that already contains the properly formatted bit fields for the descriptor.
The implicit attribute enables automatic conversion from UInt64
to MMASmemDescriptor
.
Args:
- val (
UInt64
): A 64-bit integer containing the complete descriptor bit layout.
__add__
__add__(self, offset: Int) -> Self
Add offset to descriptor's base address.
Args:
- offset (
Int
): Byte offset to add to base address.
Returns:
Self
: New descriptor with updated base address.
__iadd__
__iadd__(mut self, offset: Int)
Add offset to descriptor's base address in-place.
Args:
- offset (
Int
): Byte offset to add to base address.
create
static create[stride_byte_offset: Int, leading_byte_offset: Int, swizzle_mode: TensorMapSwizzle = 0](smem_ptr: UnsafePointer[type, address_space=AddressSpace(3), mut=mut, origin=origin]) -> Self
Create a descriptor for shared memory operand.
Parameters:
- stride_byte_offset (
Int
): Stride dimension offset in bytes. - leading_byte_offset (
Int
): Leading dimension stride in bytes. - swizzle_mode (
TensorMapSwizzle
): Memory access pattern mode.
Args:
- smem_ptr (
UnsafePointer
): Pointer to shared memory operand.
Returns:
Self
: Initialized descriptor for the shared memory operand.
Was this page helpful?
Thank you! We'll create more content like this.
Thank you for helping us improve!