
Mojo struct

ScalesLoader

@register_passable(trivial) struct ScalesLoader[tma_origin: ImmutOrigin, dtype: DType, tile_layout: TensorLayout, desc_layout: TensorLayout = tile_layout, /, *, cta_group: Int]

TMA loader for scales tensors, parameterized on the new Layout types (tile_layout and desc_layout).

The loader uses TMATile to derive its TMATensorTile type from those layouts. Loads are issued with async_copy, without multicast. Coordinates are passed in (row_coord, k_coord) order, matching the layout of the scales tensor.

Fields

  • tma_op (ScalesLoader[tma_origin, dtype, tile_layout, desc_layout, cta_group=cta_group].TmaOpPtr): Pointer to the TMA op, typed as the loader's derived TmaOp.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, TrivialRegisterPassable

comptime members

__copyinit__is_trivial

comptime __copyinit__is_trivial = True

__del__is_trivial

comptime __del__is_trivial = True

__moveinit__is_trivial

comptime __moveinit__is_trivial = True

TmaOp

comptime TmaOp = TMATile[dtype, tile_layout, desc_layout].InnerType

TmaOpPtr

comptime TmaOpPtr = Pointer[ScalesLoader[tma_origin, dtype, tile_layout, desc_layout, cta_group=cta_group].TmaOp, tma_origin]

Methods

__init__

__init__[tma_op_type: AnyType](tma_op: Pointer[tma_op_type, tma_origin]) -> Self

Accepts a pointer to any TMA op type and rebinds it to the loader's derived TmaOp type.

load

load[LayoutType: TensorLayout](self, dest: TileTensor[dtype, LayoutType, MutAnyOrigin, address_space=AddressSpace.SHARED], ref[AddressSpace._value._mlir_value] barrier: SharedMemBarrier, row_coord: Int, k_coord: Int)

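Loads a tile of scales into the shared-memory tile dest using a TMA async copy at (row_coord, k_coord). Completion is tracked through barrier.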
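To make the (row_coord, k_coord) ordering concrete, here is a minimal conceptual model in plain Python. It is not the loader's implementation and does not use any Mojo or TMA API. It assumes, purely for illustration, that the scales tensor is row-major with shape (rows, K) and that both coordinates are element offsets of the tile's first element. The helper name load_scales_tile and its tile_rows and tile_k parameters are hypothetical.

```python
# Conceptual model only: nested Python lists stand in for global memory,
# and the returned list stands in for the shared-memory destination tile.
# Assumptions (not confirmed by the API reference): `scales` is row-major
# with shape (rows, K), and the coordinates are element offsets.

def load_scales_tile(scales, tile_rows, tile_k, row_coord, k_coord):
    """Return the tile_rows x tile_k block starting at (row_coord, k_coord)."""
    return [
        row[k_coord:k_coord + tile_k]
        for row in scales[row_coord:row_coord + tile_rows]
    ]

# 4 x 6 scales tensor whose entry (r, k) is r * 10 + k, so each value
# encodes its own position.
scales = [[r * 10 + k for k in range(6)] for r in range(4)]

tile = load_scales_tile(scales, tile_rows=2, tile_k=3, row_coord=1, k_coord=2)
print(tile)  # [[12, 13, 14], [22, 23, 24]]
```

Unlike this synchronous model, the real load is asynchronous: the data is only guaranteed to be in dest once the barrier passed to load has completed.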
