Mojo function

matmul_gpu_qint4

def matmul_gpu_qint4[
    c_type: DType,
    a_type: DType,
    //,
    group_size: Int,
    target: StringSlice[StaticConstantOrigin],
    elementwise_lambda_fn: Optional[def[dtype: DType, width: SIMDSize, *, alignment: Int = Int(1)](IndexList[Int(2)], SIMD[dtype, width]) capturing -> None] = None
](
    c_tt: TileTensor[c_type, Storage=c_tt.Storage, linear_idx_type=c_tt.linear_idx_type, element_size=c_tt.element_size],
    a_tt: TileTensor[a_type, Storage=a_tt.Storage, linear_idx_type=a_tt.linear_idx_type, element_size=a_tt.element_size],
    b_tt: TileTensor[DType.uint8, Storage=b_tt.Storage, linear_idx_type=b_tt.linear_idx_type, element_size=b_tt.element_size],
    ctx: Optional[DeviceContext] = None
)
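
The signature takes `b_tt` as a `DType.uint8` tensor together with a `group_size` parameter, which suggests group-wise 4-bit weight quantization. This page does not document the packing layout, so the Python sketch below only illustrates the general technique under stated assumptions: two 4-bit codes per byte with the low nibble first, one float scale per group of `group_size` weights, a fixed offset of 8, and `B` stored row-wise so that `c[m][n] = sum_k a[m][k] * b[n][k]`. All function names are hypothetical, and none of this should be read as the layout `matmul_gpu_qint4` actually expects.

```python
# Conceptual reference for group-wise 4-bit quantized matmul.
# ASSUMPTIONS (not taken from the documentation): nibble order, per-group
# float scales, symmetric offset of 8, and B stored as rows of length K.

def quantize_group(values):
    """Quantize one group to 4-bit codes in 0..15; returns (scale, codes)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(0, min(15, round(v / scale) + 8)) for v in values]
    return scale, codes

def pack_nibbles(codes):
    """Pack pairs of 4-bit codes into bytes, low nibble first."""
    assert len(codes) % 2 == 0
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))

def unpack_nibbles(packed):
    """Inverse of pack_nibbles."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)
        codes.append(byte >> 4)
    return codes

def quantize_row(row, group_size):
    """Quantize one row of B; returns (per-group scales, packed bytes)."""
    assert len(row) % group_size == 0
    scales, packed = [], bytearray()
    for start in range(0, len(row), group_size):
        scale, codes = quantize_group(row[start:start + group_size])
        scales.append(scale)
        packed += pack_nibbles(codes)
    return scales, bytes(packed)

def dequantize_row(scales, packed, group_size):
    """Recover approximate float weights from scales and packed codes."""
    codes = unpack_nibbles(packed)
    return [(c - 8) * scales[i // group_size] for i, c in enumerate(codes)]

def matmul_qint4_reference(a, b_quant, group_size):
    """Compute c[m][n] = sum_k a[m][k] * dequant(b)[n][k]."""
    b = [dequantize_row(s, p, group_size) for s, p in b_quant]
    return [[sum(x * w for x, w in zip(a_row, b_row)) for b_row in b]
            for a_row in a]
```

A small usage example: `quantize_row([7.0, -7.0, 3.5, 0.0], 2)` produces one scale per pair of weights, and passing the result with `a = [[1.0, 2.0, 3.0, 4.0]]` to `matmul_qint4_reference` yields a 1x1 output. A GPU kernel would normally fuse the dequantization into the matmul loop rather than materializing the float weights as this sketch does.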