GPTQLinear
class max.nn.GPTQLinear(in_dim, out_dim, dtype, device, has_bias=False, quantization_encoding=None, quantization_config=None, quant_config=None)
Bases: Linear
A Linear layer for GPTQ encoding.
Initializes the layer for GPTQ-quantized linear transformations, with weights and an optional bias.
Parameters:

- in_dim (int) – The dimensionality of the input space.
- out_dim (int) – The dimensionality of the output space.
- dtype (DType) – The DType for both weights and bias.
- device (DeviceRef) – The target DeviceRef for computation. Weights remain on CPU until moved during computation.
- has_bias (bool) – When True, adds a bias vector to the layer. Defaults to False.
- quantization_encoding (QuantizationEncoding | None) – The QuantizationEncoding of the weights.
- quantization_config (QuantizationConfig | None) – Extra QuantizationConfig for the weight quantization.
- quant_config (QuantConfig | None) – QuantConfig for scaled quantization (not supported).
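To illustrate what a GPTQ-style linear layer computes, the sketch below shows the general idea behind group-quantized weights: integers plus a per-group scale and zero point are dequantized to floats before the matmul. This is a minimal, self-contained illustration of the technique, not MAX's implementation; all function names here are hypothetical, and the group size is simplified to one group per output row.

```python
# Illustrative sketch only (hypothetical names, NOT max.nn internals):
# GPTQ-style layers store W as low-bit ints plus per-group (scale, zero_point),
# and recover each float weight as w = scale * (q - zero_point) at compute time.

def dequantize_group(qweights, scale, zero_point):
    """Recover float weights from quantized ints for one group."""
    return [scale * (q - zero_point) for q in qweights]

def gptq_linear(x, qweight_rows, scales, zero_points, bias=None):
    """Compute y = x @ W^T + b with W stored row-wise as quantized groups.

    x:            input vector, length in_dim
    qweight_rows: one list of int weights per output row
    scales, zero_points: one per row (group_size == in_dim in this sketch)
    """
    out = []
    for i, (row_q, s, z) in enumerate(zip(qweight_rows, scales, zero_points)):
        w = dequantize_group(row_q, s, z)          # floats for this row
        acc = sum(xi * wi for xi, wi in zip(x, w))  # dot product
        out.append(acc + (bias[i] if bias else 0.0))
    return out
```

Storing only the integers and per-group parameters is what gives GPTQ its memory savings; the dequantize-then-matmul step is fused into the kernel in real implementations.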