Mojo struct
Q4_KEncoding
The Q4_K quantization encoding.
Because this holds the quantized data in a special packing format, it currently does not print float values at runtime; it's just a bag of bits in uint8 format.
Implemented traits
AnyType, QuantizationEncoding
Methods
quantize
static quantize(tensor: Tensor[float32]) -> Tensor[uint8]
Quantizes the full-precision input tensor to Q4_K.
Args:
- tensor (Tensor[float32]): Full-precision tensor to quantize. The innermost dimension of the tensor must be a multiple of 256.
Returns:
Quantized Q4_K tensor. The tensor datatype is uint8 because this is simply a bytes buffer. Each scalar is actually stored with 4 bits.
Raises:
If the last dimension size is not a multiple of 256.
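The shape requirement can be checked before calling quantize. The following is a minimal sketch, not part of the library: the helper name q4_k_packed_bytes is hypothetical, and the figure of 144 bytes per 256-element block is an assumption taken from the GGML Q4_K layout (two fp16 scale values, 12 bytes of packed sub-block scales, and 128 bytes of 4-bit values). This page does not state the size or shape of the returned buffer, so treat the computed byte count as an estimate only.

```mojo
# Hypothetical helper: validates the innermost dimension and estimates the
# packed size, assuming the GGML Q4_K block layout (an assumption, see above).
fn q4_k_packed_bytes(rows: Int, last_dim: Int) raises -> Int:
    var block_elems = 256  # elements per Q4_K block
    var block_bytes = 144  # assumed: 4 (scales) + 12 (sub-scales) + 128 (4-bit values)
    if last_dim % block_elems != 0:
        raise Error("last dimension must be divisible by 256")
    return rows * (last_dim // block_elems) * block_bytes


fn main() raises:
    # A 4 x 512 float32 tensor has 2 blocks per row, 8 blocks in total.
    print(q4_k_packed_bytes(4, 512))  # prints 1152
```

Under these assumptions the packed data costs 4.5 bits per scalar once the per-block scales are included, slightly more than the 4 bits used for each value.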
id
static id() -> String
Identifier for the Q4_K quantized encoding.