Mojo function
gpu_qint4_repack_GPTQ
gpu_qint4_repack_GPTQ[group_size: Int, target: StringSlice[StaticConstantOrigin]](
    b: LayoutTensor[DType.uint8, b.layout, b.origin, address_space=b.address_space, element_layout=b.element_layout, layout_int_type=b.layout_int_type, linear_idx_type=b.linear_idx_type, masked=b.masked, alignment=b.alignment],
    b_packed: LayoutTensor[DType.uint8, b_packed.layout, b_packed.origin, address_space=b_packed.address_space, element_layout=b_packed.element_layout, layout_int_type=b_packed.layout_int_type, linear_idx_type=b_packed.linear_idx_type, masked=b_packed.masked, alignment=b_packed.alignment],
    perm_idx: OptionalReg[LayoutTensor[DType.int32, Layout.row_major(-1), ImmutAnyOrigin]] = None,
    ctx: DeviceContextPtr = DeviceContextPtr()
)