Mojo function
gpu_qint4_repack_GPTQ
def gpu_qint4_repack_GPTQ[
    group_size: Int,
    target: StringSlice[StaticConstantOrigin]
](
    b_tt: TileTensor[DType.uint8, Storage=b_tt.Storage, linear_idx_type=b_tt.linear_idx_type, element_size=b_tt.element_size],
    b_packed_tt: TileTensor[DType.uint8, Storage=b_packed_tt.Storage, linear_idx_type=b_packed_tt.linear_idx_type, element_size=b_packed_tt.element_size],
    perm_idx: OptionalReg[LayoutTensor[DType.int32, Layout.row_major(Int(-1)), ImmutAnyOrigin]] = None,
    ctx: Optional[DeviceContext] = None
)