
Mojo struct

Q4sym

struct Q4sym[group_size: Int, float_dtype: DType = DType.float32]

Q4sym compresses values of type float_dtype to 4-bit unsigned integers that have been dynamically and symmetrically quantized with the given scale factor.

group_size determines the number of elements that share quantization parameters.

The bits are stored in a strided fashion. For example, assume group_size = 8 and we want to pack the uint4 numbers A, B, C, D, E, F, G, H, whose bits are aaaa, bbbb, cccc, and so on. The bytes are then laid out as:

eeeeaaaa|ffffbbbb|ggggcccc|hhhhdddd

To decompress to floating point, take the decoded uint4 value, subtract the implicit zero-point of 8 (the midpoint of the 2^4 = 16 representable values), and multiply by the scale factor.
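To make the layout concrete, here is a small Python sketch (an illustration, not part of the Mojo API) that packs and unpacks uint4 values using the strided scheme above, for group_size = 8:

```python
def pack_u4(vals):
    # Byte i stores value i in its low nibble and value i + group_size//2
    # in its high nibble, matching the strided layout described above.
    half = len(vals) // 2
    return [vals[i] | (vals[i + half] << 4) for i in range(half)]

def unpack_u4(packed):
    # Low nibbles give the first half of the group, high nibbles the second.
    low = [b & 0x0F for b in packed]
    high = [b >> 4 for b in packed]
    return low + high

vals = [1, 2, 3, 4, 5, 6, 7, 8]   # uint4 values A..H
packed = pack_u4(vals)            # [0x51, 0x62, 0x73, 0x84]
assert unpack_u4(packed) == vals
```

Note that byte 0 is `0x51`: the high nibble holds E (5) and the low nibble holds A (1), exactly the `eeeeaaaa` pattern shown above.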

Parameters

  • group_size (Int): The number of encoded numbers stored in this struct.
  • float_dtype (DType): The floating point dtype this struct works with.

Fields

  • scale (StaticTuple[UInt8, 2]): The FP16 scale of the group, stored as individual bytes.
  • bits (StaticTuple[UInt8, (group_size // 2)]): The bits of the encoded uint4 numbers.
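Since the scale is an FP16 value stored as two raw bytes, it can be reassembled with a half-precision unpack. A Python sketch for illustration; the little-endian byte order is an assumption, not something this page specifies:

```python
import struct

# Rebuild a half-precision float from its two stored bytes.
# Byte order (little-endian here) is an assumption for illustration.
scale_bytes = bytes([0x00, 0x3C])            # 0x3C00 encodes 1.0 in IEEE FP16
(scale,) = struct.unpack("<e", scale_bytes)  # "e" = half-precision float
```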

Implemented traits

AnyType, Defaultable, ImplicitlyDestructible

Methods

__init__

__init__(out self)

Construct a default-initialized Q4sym.

__init__(out self, data: SIMD[float_dtype, group_size])

Construct an encoded Q4sym from data.

Args:

  • data (SIMD[float_dtype, group_size]): The floating point values to encode.

decode_scale

decode_scale(mut self) -> Float16

Obtain the scale factor.

Returns:

Float16: The decoded scale factor.

decode_unsigned

decode_unsigned(mut self) -> SIMD[DType.uint8, group_size]

Decode the stored uint4 numbers to uint8.

Returns:

SIMD[DType.uint8, group_size]: The decoded stored numbers as uint8 numbers. These have an implicit zero-point of 8.

decode_signed

decode_signed(mut self) -> SIMD[DType.int8, group_size]

Decode the stored uint4 numbers to requantized int4 numbers.

This is done by subtracting the implicit zero-point of 8 from the unsigned decoding.

Returns:

SIMD[DType.int8, group_size]: The decoded stored numbers as int8 numbers. These have a zero-point of 0.
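In Python terms, the signed decode is just an elementwise subtraction (illustrative only):

```python
# Unsigned uint4 codes span [0, 15]; subtracting the implicit zero-point
# of 8 recenters them to the signed int4 range [-8, 7].
unsigned = [0, 1, 8, 15]
signed = [u - 8 for u in unsigned]
```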

decode_fully

decode_fully(mut self) -> SIMD[float_dtype, group_size]

Decode the stored numbers into floating point representation.

Returns:

SIMD[float_dtype, group_size]: The decoded numbers.
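A hypothetical end-to-end decode in Python, combining the unsigned nibble decode, the zero-point subtraction, and the scale multiplication; this is a sketch of the steps described above, not the Mojo implementation:

```python
def dequantize_group(packed, scale):
    # Low nibbles hold the first half of the group, high nibbles the second
    # (the strided layout described earlier).
    u4 = [b & 0x0F for b in packed] + [b >> 4 for b in packed]
    # Subtract the implicit zero-point of 8, then apply the scale factor.
    return [(u - 8) * scale for u in u4]

# One packed byte 0x51 holds the uint4 values 1 (low) and 5 (high).
values = dequantize_group([0x51], 0.5)  # [-3.5, -1.5]
```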

quantize_and_write_to_tensor

static quantize_and_write_to_tensor[input_rank: Int](input_tt: TileTensor[float_dtype, linear_idx_type=input_tt.linear_idx_type, element_size=input_tt.element_size], output_tt: TileTensor[DType.uint8, linear_idx_type=output_tt.linear_idx_type, element_size=output_tt.element_size], input_shape: IndexList[input_rank])

Encodes the floating point numbers in input_tt along the inner-most dimension and writes the result to output_tt.

Args:

  • input_tt (TileTensor): The tensor of floating point numbers to encode.
  • output_tt (TileTensor): The tensor receiving the encoded bytes.
  • input_shape (IndexList[input_rank]): The shape of the input tensor.
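How the per-group scale is chosen is not spelled out here. The Python sketch below assumes the common choice of max(|x|) / 7 for symmetric 4-bit quantization with zero-point 8; this is an assumption for illustration and not necessarily what this method does internally:

```python
def quantize_group(xs):
    # Dynamic symmetric quantization of one group to uint4 codes with an
    # implicit zero-point of 8. The scale choice max(|x|)/7 is an assumption.
    scale = max(abs(x) for x in xs) / 7 or 1.0  # fall back to 1.0 for all-zero input
    # Round, shift by the zero-point, and clamp into the uint4 range [0, 15].
    return scale, [min(15, max(0, round(x / scale) + 8)) for x in xs]
```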

dequantize_and_write_to_tensor

static dequantize_and_write_to_tensor[output_rank: Int](input_tt: TileTensor[DType.uint8, linear_idx_type=input_tt.linear_idx_type, element_size=input_tt.element_size], output_tt: TileTensor[float_dtype, linear_idx_type=output_tt.linear_idx_type, element_size=output_tt.element_size], output_shape: IndexList[output_rank])

Decodes the uint4 numbers in input_tt along the inner-most dimension and writes the floating point result to output_tt.

Args:

  • input_tt (TileTensor): The tensor of encoded bytes to decode.
  • output_tt (TileTensor): The tensor receiving the decoded floating point numbers.
  • output_shape (IndexList[output_rank]): The shape of the output tensor.