Mojo struct
ScalesLoader
@register_passable(trivial)
struct ScalesLoader[tma_origin: ImmutOrigin, dtype: DType, tile_layout: TensorLayout, desc_layout: TensorLayout = tile_layout, /, *, cta_group: Int]
TMA scales loader parameterized on the new Layout types.
Uses TMATile to derive the TMATensorTile type from the new Layout types, and loads with async_copy (no multicast). The coordinate order is (row_coord, k_coord), matching the scales tensor layout.
Fields
- tma_op (ScalesLoader[tma_origin, dtype, tile_layout, desc_layout, cta_group=cta_group].TmaOpPtr)
Implemented traits
AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, TrivialRegisterPassable
comptime members
__copyinit__is_trivial
comptime __copyinit__is_trivial = True
__del__is_trivial
comptime __del__is_trivial = True
__moveinit__is_trivial
comptime __moveinit__is_trivial = True
TmaOp
comptime TmaOp = TMATile[dtype, tile_layout, desc_layout].InnerType
TmaOpPtr
comptime TmaOpPtr = Pointer[ScalesLoader[tma_origin, dtype, tile_layout, desc_layout, cta_group=cta_group].TmaOp, tma_origin]
Methods
__init__
__init__[tma_op_type: AnyType](tma_op: Pointer[tma_op_type, tma_origin]) -> Self
Accepts a pointer to any TMA op type and rebinds it to the loader's derived pointer type (TmaOpPtr).
load
load[LayoutType: TensorLayout](self, dest: TileTensor[dtype, LayoutType, MutAnyOrigin, address_space=AddressSpace.SHARED], ref[AddressSpace._value._mlir_value] barrier: SharedMemBarrier, row_coord: Int, k_coord: Int)
Load scales using TMA async copy.
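The (row_coord, k_coord) ordering can be illustrated with a small conceptual model. This is plain Python, not the Mojo API: the helper `load_scales_tile` and its `tile_rows`/`tile_k` parameters are hypothetical, and it assumes the coordinates are element offsets into a row-major (rows x K) scales tensor. The real load is an asynchronous TMA copy into shared memory coordinated through a barrier, which this sketch does not model.

```python
# Conceptual model only: plain Python, not the Mojo ScalesLoader API.
# Assumption: row_coord and k_coord are element offsets into a row-major
# (rows x K) scales tensor. The synchronous slice below stands in for the
# asynchronous TMA copy.

def load_scales_tile(scales, tile_rows, tile_k, row_coord, k_coord):
    """Return the tile_rows x tile_k block whose top-left element is
    at (row_coord, k_coord)."""
    return [
        row[k_coord:k_coord + tile_k]
        for row in scales[row_coord:row_coord + tile_rows]
    ]

# 4 x 6 scales tensor where each entry encodes its own position as 10*row + k.
scales = [[10 * r + k for k in range(6)] for r in range(4)]

# The first coordinate selects the row offset, the second the K offset.
tile = load_scales_tile(scales, tile_rows=2, tile_k=3, row_coord=1, k_coord=2)
print(tile)  # [[12, 13, 14], [22, 23, 24]]
```

Swapping the two coordinates would select a different block, which is why the argument order matters when calling load.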