
Mojo struct

ScalesLoader

struct ScalesLoader[tma_origin: ImmutOrigin, dtype: DType, tile_layout: TensorLayout, desc_layout: TensorLayout = tile_layout, /, *, cta_group: Int]

TMA scales loader parameterized on the new Layout types.

Uses TmaOpType to derive the TMATensorTile type from the new Layout types. Loads are issued with async_copy, without multicast. Coordinates are passed in (row_coord, k_coord) order, matching the layout of the scales tensor.

Fields

  • tma_op (ScalesLoader[tma_origin, dtype, tile_layout, desc_layout, cta_group=cta_group].TmaOpPtr)

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, TrivialRegisterPassable

comptime members

TmaOp

comptime TmaOp = TMATensorTile[dtype, tile_layout.rank, _to_index_list[tile_layout](), _to_index_list[tile_layout.rank, desc_layout]()]

TmaOpPtr

comptime TmaOpPtr = Pointer[ScalesLoader[tma_origin, dtype, tile_layout, desc_layout, cta_group=cta_group].TmaOp, tma_origin]

Methods

__init__

__init__[tma_op_type: AnyType](tma_op: Pointer[tma_op_type, tma_origin]) -> Self

Accepts a pointer to any TMA op type and rebinds it to the loader's derived type.

load

load[LayoutType: TensorLayout](self, dest: TileTensor[dtype, LayoutType, MutAnyOrigin, address_space=AddressSpace.SHARED], ref[AddressSpace._value] barrier: SharedMemBarrier, row_coord: Int, k_coord: Int)

Load scales using TMA async copy.
