Mojo module
quantization
comptime values
logger
comptime logger = Logger(stdout, prefix=String(""), source_location=False)
Structs
- GGMLQ40Dequantize
- GGMLQ4KDequantize
- GGMLQ6KDequantize
- QMatmulGPU_b4_g128
- QMatmulGPU_b4_g32
- QMatmulGPURepackGGUF
- QMatmulGPURepackGPTQ_b4_g128
- QMatmulGPURepackGPTQ_b4_g128_desc_act
- QuantizeDynamicScaledFloat8
- QuantizeStaticScaledFloat8
- QuantizeTensorDynamicScaledFloat8
- ResizeBicubic
- ResizeLinear
- ResizeNearest
- RMSNormFusedQuantizeDynamicScaledFP8
- Struct_dequant_mxfp4
- Struct_grouped_quantize_dynamic_block_scaled
- Struct_interleave_block_scales
- Struct_mxfp4_preshuffle_b_5d: Run the AMD CDNA4 MXFP4 B 5D preshuffle as a custom op.
- Struct_mxfp4_preshuffle_scale_4d_per_expert: Per-step A-scale preshuffle for the AMD CDNA4 preb grouped matmul.
- Struct_quantize_dynamic_block_scaled
- Struct_quantize_dynamic_block_scaled_mxfp4
- Struct_unfused_qkv_matmul_ragged_paged_gguf_quantized
- VroomQ40Matmul
- VroomQ40RepackWeights
- VroomQ4KMatmul
- VroomQ4KRepackWeights
- VroomQ6KMatmul
- VroomQ6KRepackWeights
Functions
- composite_rms_norm_fused_quantize_dynamic_scaled_fp8_shape
- ggml_q4_0_dequantize_shape
- ggml_q4_k_dequantize_shape
- ggml_q6_k_dequantize_shape
- GGUF_gpu_repack_q4_0_shape
- GPTQ_gpu_repack_b4_g128_desc_act_shape
- GPTQ_gpu_repack_b4_g128_shape
- qmatmul_b4_g128_shape
- qmatmul_b4_g32_shape
- resize_bicubic_shape
- resize_linear_shape
- resize_nearest_shape
- vroom_q4_0_matmul_shape
- vroom_q4_0_repack_weights_shape
- vroom_q4_k_matmul_shape
- vroom_q4_k_repack_weights_shape
- vroom_q6_k_matmul_shape
- vroom_q6_k_repack_weights_shape