
Mojo function

grouped_matmul_block_scaled_mxfp4

def grouped_matmul_block_scaled_mxfp4[out_dtype: DType](
    c: TileTensor[out_dtype, Storage=c.Storage, address_space=c.address_space, linear_idx_type=c.linear_idx_type, element_size=c.element_size],
    a: TileTensor[DType.uint8, Storage=a.Storage, address_space=a.address_space, linear_idx_type=a.linear_idx_type, element_size=a.element_size],
    b: TileTensor[DType.uint8, Storage=b.Storage, address_space=b.address_space, linear_idx_type=b.linear_idx_type, element_size=b.element_size],
    a_scales: TileTensor[DType.float8_e8m0fnu, Storage=a_scales.Storage, address_space=a_scales.address_space, linear_idx_type=a_scales.linear_idx_type, element_size=a_scales.element_size],
    b_scales: TileTensor[DType.float8_e8m0fnu, Storage=b_scales.Storage, address_space=b_scales.address_space, linear_idx_type=b_scales.linear_idx_type, element_size=b_scales.element_size],
    row_offsets: TileTensor[DType.uint32, Storage=row_offsets.Storage, address_space=row_offsets.address_space, linear_idx_type=row_offsets.linear_idx_type, element_size=row_offsets.element_size],
    expert_ids: TileTensor[DType.int32, Storage=expert_ids.Storage, address_space=expert_ids.address_space, linear_idx_type=expert_ids.linear_idx_type, element_size=expert_ids.element_size],
    num_active_experts: Int,
    ctx: DeviceContext)
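
The signature takes `a` and `b` as `DType.uint8` and the scales as `DType.float8_e8m0fnu`, which is consistent with the MXFP4 encoding from the OCP Microscaling (MX) format: 4-bit E2M1 elements stored two per byte, with each block of 32 elements sharing one E8M0 power-of-two scale. The following is a minimal Python sketch of that decoding, not the kernel's implementation. The helper names (`decode_e2m1`, `decode_e8m0`, `dequantize_block`) are hypothetical, and the nibble order (low nibble first) is an assumption, since this page does not document how the kernel packs or lays out its operands.

```python
# E2M1 magnitudes indexed by the low 3 bits of a 4-bit code (OCP MX format).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_e2m1(code: int) -> float:
    """Decode one 4-bit E2M1 code: bit 3 is the sign, bits 0-2 select the magnitude."""
    magnitude = E2M1_MAGNITUDES[code & 0x7]
    return -magnitude if code & 0x8 else magnitude


def decode_e8m0(byte: int) -> float:
    """Decode an E8M0 scale: an unsigned exponent with bias 127; 0xFF encodes NaN."""
    if byte == 0xFF:
        return float("nan")
    return 2.0 ** (byte - 127)


def dequantize_block(packed: bytes, scale_byte: int) -> list[float]:
    """Expand packed FP4 pairs and apply the block's shared scale.

    ASSUMPTION: the low nibble holds the earlier element. The packing order
    used by the real kernel is not documented on this page.
    """
    scale = decode_e8m0(scale_byte)
    values = []
    for byte in packed:
        values.append(decode_e2m1(byte & 0xF) * scale)  # low nibble
        values.append(decode_e2m1(byte >> 4) * scale)   # high nibble
    return values
```

For instance, `dequantize_block(bytes([0x21]), 128)` decodes the codes `0x1` and `0x2` (0.5 and 1.0) and multiplies both by the scale 2^(128 - 127) = 2. A full MX block would pass 16 packed bytes to cover its 32 elements.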