Mojo struct
Q4sym
struct Q4sym[group_size: Int, float_dtype: DType = DType.float32]
Q4sym compresses values of type float_dtype to 4-bit unsigned integers that have been dynamically and symmetrically quantized with a per-group scale factor.
group_size determines the number of elements which share quantization
parameters.
We store the values in a strided fashion. Example: assume group_size = 8 and we want to pack the uint4 numbers A, B, C, D, E, F, G, H, which have the associated bits aaaa, bbbb, cccc, and so on. The packed bytes are:

eeeeaaaa|ffffbbbb|ggggcccc|hhhhdddd

To decompress to floating point, take the decoded uint4 value, subtract the implicit zero-point of 2^4 / 2 = 8, and multiply by the scale factor.
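The strided layout above can be sketched in plain Python (an illustrative model of the packing scheme, not the Mojo implementation): element i of the first half of the group goes in the low nibble of byte i, and element i of the second half goes in the high nibble.

```python
# Illustrative sketch of the strided uint4 packing described above.
# With group_size = 8, byte 0 is eeeeaaaa, byte 1 is ffffbbbb, etc.
def pack_uint4(vals):
    """Pack integers in [0, 15] into len(vals) // 2 bytes."""
    half = len(vals) // 2
    return [vals[i] | (vals[i + half] << 4) for i in range(half)]

def unpack_uint4(packed):
    """Recover the original uint4 values from the packed bytes."""
    lo = [b & 0x0F for b in packed]
    hi = [b >> 4 for b in packed]
    return lo + hi

packed = pack_uint4([1, 2, 3, 4, 5, 6, 7, 8])
assert packed[0] == 0x51                 # eeeeaaaa: 5 in high nibble, 1 in low
assert unpack_uint4(packed) == [1, 2, 3, 4, 5, 6, 7, 8]
```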
Parameters

- group_size (Int): The number of encoded numbers stored in this struct.
- float_dtype (DType): The floating-point dtype this struct works with.
Fields

- scale (StaticTuple[UInt8, 2]): The FP16 scale of the group, stored as individual bytes.
- bits (StaticTuple[UInt8, group_size // 2]): The bits of the encoded uint4 numbers.
Implemented traits
AnyType,
Defaultable,
ImplicitlyDestructible
Methods

__init__
__init__(out self)
Construct a default initialized Q4sym.
__init__(out self, data: SIMD[float_dtype, group_size])
Construct an encoded Q4sym from data.
Args:

- data (SIMD[float_dtype, group_size]): The floating point data to encode and store.
decode_scale
decode_scale(mut self) -> Float16
Obtain the scale factor.
Returns:
Float16: The decoded scale factor.
decode_unsigned
decode_unsigned(mut self) -> SIMD[DType.uint8, group_size]
Decode the stored uint4 numbers to uint8.
Returns:
SIMD[DType.uint8, group_size]: The decoded stored numbers as uint8 numbers. These have an implicit
zero-point of 8.
decode_signed
decode_signed(mut self) -> SIMD[DType.int8, group_size]
Decode the stored uint4 numbers to requantized int4 numbers.
This is done by subtracting the implicit zero-point of 8 from the unsigned decoding.
Returns:
SIMD[DType.int8, group_size]: The decoded stored numbers as int8 numbers. These have a zero-point of
0.
decode_fully
decode_fully(mut self) -> SIMD[float_dtype, group_size]
Decode the stored numbers into floating point representation.
Returns:
SIMD[float_dtype, group_size]: The decoded numbers.
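The full decode path (decode_unsigned, then decode_signed, then decode_fully) can be traced numerically in Python with made-up values (the scale and stored values below are illustrative, not from the library):

```python
# Hedged sketch of the decode chain: uint4 -> subtract zero-point 8 ->
# multiply by the per-group scale. Values are example data only.
scale = 0.5                           # decoded FP16 scale (decode_scale)
stored = [0, 4, 8, 12, 15]            # decoded uint4 values (decode_unsigned)
signed = [u - 8 for u in stored]      # decode_signed: zero-point is now 0
floats = [s * scale for s in signed]  # decode_fully
assert floats == [-4.0, -2.0, 0.0, 2.0, 3.5]
```

Note that the stored value 8 decodes exactly to 0.0, which is what makes the scheme symmetric around zero.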
quantize_and_write_to_tensor
static quantize_and_write_to_tensor[input_rank: Int](input_tt: TileTensor[float_dtype, linear_idx_type=input_tt.linear_idx_type, element_size=input_tt.element_size], output_tt: TileTensor[DType.uint8, linear_idx_type=output_tt.linear_idx_type, element_size=output_tt.element_size], input_shape: IndexList[input_rank])
Encodes the floating point numbers in input_tt along the inner-most dimension and writes the result to output_tt.
Args:

- input_tt (TileTensor[float_dtype, linear_idx_type=input_tt.linear_idx_type, element_size=input_tt.element_size]): The input tensor we are encoding.
- output_tt (TileTensor[DType.uint8, linear_idx_type=output_tt.linear_idx_type, element_size=output_tt.element_size]): The output tensor containing the encoded input. The shape of the output should be the same as the input except along the inner dimension: if the original inner dimension was d, the corresponding output dimension should be ceil(d / group_size) * size_of(Self).
- input_shape (IndexList[input_rank]): The shape of the input tensor.
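The output-dimension formula can be checked with a small calculation. Per the Fields section, each group occupies 2 bytes of FP16 scale plus group_size // 2 bytes of packed uint4 data (an assumption that size_of(Self) is exactly this sum, with no padding):

```python
import math

# Assumed per-group size: 2 scale bytes + group_size // 2 packed bytes.
group_size = 32
size_of_q4sym = 2 + group_size // 2   # 18 bytes per group

d = 100                               # example original inner dimension
inner_out = math.ceil(d / group_size) * size_of_q4sym
assert inner_out == 4 * 18            # 4 groups -> 72 output bytes
```

This also shows the compression ratio: 100 float32 values (400 bytes) shrink to 72 bytes, roughly 4.5 bits per value including the per-group scales.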
dequantize_and_write_to_tensor
static dequantize_and_write_to_tensor[output_rank: Int](input_tt: TileTensor[DType.uint8, linear_idx_type=input_tt.linear_idx_type, element_size=input_tt.element_size], output_tt: TileTensor[float_dtype, linear_idx_type=output_tt.linear_idx_type, element_size=output_tt.element_size], output_shape: IndexList[output_rank])
Decodes the uint4 numbers in input_tt along the inner-most dimension and writes the floating point result to output_tt.
Args:

- input_tt (TileTensor[DType.uint8, linear_idx_type=input_tt.linear_idx_type, element_size=input_tt.element_size]): The input tensor we are decoding.
- output_tt (TileTensor[float_dtype, linear_idx_type=output_tt.linear_idx_type, element_size=output_tt.element_size]): The output tensor containing the decoded input.
- output_shape (IndexList[output_rank]): The shape of the output tensor.