Mojo module

mla_decode_sm100

`comptime` values

`logger`

comptime logger = Logger[DEFAULT_LEVEL](stdout, "", False)

`MBarType`

comptime MBarType = UnsafePointer[SharedMemBarrier, MutAnyOrigin, address_space=AddressSpace.SHARED]

`QOTMATile`

comptime QOTMATile[dtype: DType, BM: Int, BK: Int, swizzle_mode: TensorMapSwizzle] = TMATensorTile[dtype, tile_layout_k_major[dtype, BM, BK, swizzle_mode](), _tma_desc_tile_layout[dtype, 2, IndexList[2, DType.int64](BM, BK, Tuple[]()), swizzle_mode]()]

Parameters

dtype (DType):
BM (Int):
BK (Int):
swizzle_mode (TensorMapSwizzle):

`SharedMemPointer`

comptime SharedMemPointer[type: AnyType] = UnsafePointer[type, MutAnyOrigin, address_space=AddressSpace.SHARED]

Parameters

type (AnyType):

`SharedMemTensor`

comptime SharedMemTensor[dtype: DType, layout: Layout] = LayoutTensor[dtype, layout, MutAnyOrigin, address_space=AddressSpace.SHARED, layout_int_type=DType.int32, linear_idx_type=DType.int32, alignment=128]

Parameters

dtype (DType):
layout (Layout):

Structs

DecodeCConsumer:
DecodeCProducer:
DecodeKVConsumer:
DecodeKVProducer:
DecodeOConsumer:
DecodeOProducer:
DecodeOutConsumer:
DecodeOutProducer:
DecodePConsumer:
DecodePProducer:
DecodeSConsumer:
DecodeSM100MiscMBars:
DecodeSM100PVSS:
DecodeSM100QKTSS:
DecodeSProducer:
KVPipelineGeneric: KVPipeline has num_kv_stages * num_qk_stages stages. num_kv_stages is the number of K and V tiles pipelined for the S = Q@K' and O += P@V MMAs. Each of these MMAs is split into num_qk_stages pipelined MMAs, and step=False is set for all but the last MMA, which completes the operation. An alternative implementation would track the two stage counts separately, potentially allowing more overall stages at the cost of slightly more bookkeeping.
MLA_Decode_Pack:
MLA_SM100_Decode:
MLA_SM100_Decode_Config:
OffsetPosition:
OutPipeline: OutPipeline has num_out_stages stages, where num_out_stages is the number of stages pipelined for the output store.
TMADestination:
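
The stage arithmetic described for KVPipelineGeneric can be illustrated with a small model. This is only a sketch in plain Python, not the Mojo implementation: the function name `kv_pipeline_schedule`, the flat index layout `kv_stage * num_qk_stages + qk_stage`, and the reading of `step` as "advance to the next K/V tile slot" are all assumptions made for illustration and are not taken from the module.

```python
def kv_pipeline_schedule(num_kv_stages, num_qk_stages, num_tiles):
    """Model which pipeline stage each sub-MMA would use.

    Hypothetical helper: returns a list of tuples
    (tile, kv_stage, qk_stage, flat_stage, step).
    """
    schedule = []
    for tile in range(num_tiles):
        # K/V tiles cycle through num_kv_stages slots (assumed round-robin).
        kv_stage = tile % num_kv_stages
        for qk_stage in range(num_qk_stages):
            # Assumed flat layout over num_kv_stages * num_qk_stages stages.
            flat_stage = kv_stage * num_qk_stages + qk_stage
            # step is False for all but the last sub-MMA of the operation.
            step = qk_stage == num_qk_stages - 1
            schedule.append((tile, kv_stage, qk_stage, flat_stage, step))
    return schedule


if __name__ == "__main__":
    for entry in kv_pipeline_schedule(num_kv_stages=2, num_qk_stages=3, num_tiles=3):
        print(entry)
```

With 2 K/V stages and 3 sub-MMA stages, the model uses 6 flat stages in total, and only every third sub-MMA carries `step=True`. A design that tracked the two counts separately would replace the single flat index with two independent counters.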
