Mojo struct

Q4_KEncoding

struct Q4_KEncoding

The Q4_K quantization encoding.

Because this encoding holds the quantized data in a special packed format, a quantized tensor currently does not print as float values at runtime; it is just a bag of bits stored as uint8.
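To make the "bag of bits" idea concrete, here is a minimal Python sketch of how two 4-bit codes can share a single uint8 byte. It is illustrative only: `pack_nibbles` and `unpack_nibbles` are hypothetical helpers, not part of the MAX API, and the low-nibble-first ordering is an assumption. The actual Q4_K byte layout is more involved and is not reproduced here.

```python
def pack_nibbles(codes):
    """Pack 4-bit codes (ints in 0..15) two per byte, low nibble first.

    Hypothetical helper for illustration; not the real Q4_K layout.
    """
    if len(codes) % 2 != 0:
        raise ValueError("need an even number of 4-bit codes")
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("each code must fit in 4 bits (0..15)")
    # Even-indexed codes go in the low nibble, odd-indexed in the high nibble.
    return bytes(lo | (hi << 4) for lo, hi in zip(codes[0::2], codes[1::2]))


def unpack_nibbles(packed):
    """Inverse of pack_nibbles: recover the list of 4-bit codes."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)  # low nibble
        codes.append(byte >> 4)    # high nibble
    return codes


if __name__ == "__main__":
    packed = pack_nibbles([1, 2, 15, 0])
    print(list(packed), unpack_nibbles(packed))
```

Printing the packed bytes directly shows raw integers rather than the original values, which is why a quantized tensor does not display as floats.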

Implemented traits

AnyType, QuantizationEncoding, UnknownDestructibility

Methods

`quantize`

static quantize(tensor: Tensor[float32]) -> Tensor[uint8]

Quantizes the full-precision tensor `tensor` to Q4_K.

Args:

tensor (Tensor[float32]): Full-precision tensor to quantize. The size of the tensor's innermost dimension must be a multiple of 256.

Returns:

The quantized Q4_K tensor. Its datatype is uint8 because the result is simply a byte buffer; each quantized scalar actually occupies 4 bits.

Raises:

If the size of the last dimension is not a multiple of 256.
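The shape requirement can be checked before calling `quantize`. The following Python sketch assumes the constraint means the last dimension must be evenly divisible by 256. `check_q4_k_shape` and `Q4_K_BLOCK_SIZE` are hypothetical names introduced for illustration, not part of the MAX API.

```python
Q4_K_BLOCK_SIZE = 256  # assumed number of elements per Q4_K block


def check_q4_k_shape(shape):
    """Return the number of 256-element blocks along the last dimension.

    Raises ValueError if the shape is empty or the last dimension is not
    a positive multiple of Q4_K_BLOCK_SIZE. Hypothetical pre-flight check.
    """
    if len(shape) == 0:
        raise ValueError("tensor must have at least one dimension")
    last = shape[-1]
    if last <= 0 or last % Q4_K_BLOCK_SIZE != 0:
        raise ValueError(
            f"last dimension {last} is not a multiple of {Q4_K_BLOCK_SIZE}"
        )
    return last // Q4_K_BLOCK_SIZE


if __name__ == "__main__":
    print(check_q4_k_shape((4, 512)))
```

Under this reading, a shape such as (4, 512) is accepted, while (4, 128) is rejected even though 128 divides 256.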

`id`

static id() -> String

Identifier for the Q4_K quantized encoding.
