Mojo struct
DecodeSM100QKTSS_FP8
struct DecodeSM100QKTSS_FP8[operand_type: DType, accum_type: DType, *, config: MLA_SM100_Decode_Config]
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDeletable,
Movable,
RegisterPassable,
TrivialRegisterPassable
comptime members
BK
comptime BK = config.BK_QK
MMA_K
comptime MMA_K = 32
MMA_M
comptime MMA_M = config.MMA_M
MMA_N
comptime MMA_N = config.MMA_QK_N
num_k_mmas
comptime num_k_mmas = (config // Int(32))
operand_size
comptime operand_size = size_of[operand_type]()
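The relationship between BK, MMA_K, num_k_mmas, and operand_size can be modelled outside the kernel. The following is a minimal Python sketch, not Mojo kernel code. It assumes that num_k_mmas is the K-tile width (BK) divided by MMA_K, which the rendered expression `config // Int(32)` suggests but does not state. The value of `BK_QK`, the helper names, and the contiguous K-major layout (swizzling is ignored) are hypothetical and chosen only for illustration.

```python
# Hypothetical stand-in for the config field used by the struct.
# The value below is illustrative only, not taken from MLA_SM100_Decode_Config.
BK_QK = 64          # assumed K-tile width, in elements

MMA_K = 32          # from the page: comptime MMA_K = 32
OPERAND_SIZE = 1    # bytes per element for an 8-bit float operand


def num_k_mmas(bk: int, mma_k: int = MMA_K) -> int:
    """Number of MMA_K-wide steps needed to cover a K tile of width bk.

    Assumption: the tile width is an exact multiple of mma_k.
    """
    if bk % mma_k != 0:
        raise ValueError("bk must be a multiple of mma_k")
    return bk // mma_k


def k_step_byte_offsets(
    bk: int, mma_k: int = MMA_K, operand_size: int = OPERAND_SIZE
) -> list[int]:
    """Byte offset along K of each MMA step in a contiguous K-major row.

    Assumption: no swizzling; a real shared-memory layout may differ.
    """
    return [k * mma_k * operand_size for k in range(num_k_mmas(bk, mma_k))]


print(num_k_mmas(BK_QK))
print(k_step_byte_offsets(BK_QK))
```

Under these assumptions, each step along K advances by MMA_K * operand_size bytes, so an 8-bit operand type moves 32 bytes per step.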
UMMAInstDesc
comptime UMMAInstDesc = UMMAInsDescriptor.create[accum_type, operand_type, operand_type, Index[Int, Int, dtype=DType.uint32](config, config)]()
Methods
descriptor_q_block
static def descriptor_q_block(q_smem: UnsafePointer[Scalar[operand_type], MutAnyOrigin, address_space=AddressSpace.SHARED]) -> MMASmemDescriptorPair
Returns: MMASmemDescriptorPair
descriptor_k_block
static def descriptor_k_block(kv_smem: UnsafePointer[Scalar[operand_type], MutAnyOrigin, address_space=AddressSpace.SHARED]) -> MMASmemDescriptorPair
Returns: MMASmemDescriptorPair
mma
static def mma[*, stage_idx: Int = Int(0)](a: MMASmemDescriptorPair, b: MMASmemDescriptorPair, c: UInt32, *, c_scale: UInt32, elect: Int32)