Python class
GGUFWeights
class max.graph.weights.GGUFWeights(source, tensors=None, prefix='', allocated=None)
Bases: Weights
Implementation for loading weights from GGUF (GPT-Generated Unified Format) files.
GGUFWeights provides an interface to load model weights from GGUF files,
which are optimized for quantized large language models. GGUF is the
successor to the GGML format and is commonly used in the llama.cpp ecosystem
for efficient storage and loading of quantized models.
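Before handing a file to GGUFWeights, it can help to confirm that it really is a GGUF file. The sketch below is not part of the MAX API. It assumes the little-endian header layout used by GGUF versions 2 and 3 (the 4-byte magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key-value count), and `read_gguf_header` is a hypothetical helper name. It parses a synthetic in-memory header rather than a real model file.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(stream):
    """Return (version, tensor_count, metadata_kv_count) from a binary stream.

    Hypothetical helper for illustration; assumes a little-endian
    GGUF v2/v3 header.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # uint32 version, uint64 tensor count, uint64 metadata KV count
    raw = stream.read(20)
    if len(raw) != 20:
        raise ValueError("truncated GGUF header")
    return struct.unpack("<IQQ", raw)


# Synthetic header bytes standing in for the start of a real file.
fake_file = io.BytesIO(GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24))
header = read_gguf_header(fake_file)
print(header)
```

For a file on disk, the same function could be called on `open(path, "rb")`.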
from pathlib import Path
from max.dtype import DType
from max.graph import DeviceRef
from max.graph.quantization import QuantizationEncoding
from max.graph.weights import GGUFWeights

gguf_path = Path("model-q4_k.gguf")
weights = GGUFWeights(gguf_path)

# Check if a weight exists
if weights.model.layers[0].attention.wq.exists():
    # Allocate quantized attention weight
    wq_weight = weights.model.layers[0].attention.wq.allocate(
        dtype=DType.uint8,  # GGUF quantized weights use uint8
        device=DeviceRef.CPU(),
    )

# Access weight data with quantization info
weight_data = weights.model.layers[0].attention.wq.data()
print(f"Quantization: {weight_data.quantization_encoding}")
print(f"Shape: {weight_data.shape}")

# Allocate with quantization validation
ffn_weight = weights.model.layers[0].feed_forward.w1.allocate(
    quantization_encoding=QuantizationEncoding.Q4_K,
    device=DeviceRef.GPU(0),
)

# Iterate through all weights in a layer
for name, weight in weights.model.layers[0].items():
    if weight.exists():
        print(f"Found weight: {name}")

Creates a GGUF weights reader.
-
Parameters:
allocate()
allocate(dtype=None, shape=None, quantization_encoding=None, device=cpu:0)
Creates and optionally validates a new Weight.
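The phrase "optionally validates" can be read as: any argument that is passed explicitly is compared with what the file records for that tensor, and arguments left as None are not checked. The sketch below illustrates that reading only. It is an assumption about the behavior, not the MAX implementation, and `StoredTensor` and `validate_request` are hypothetical names that use plain strings in place of `DType` and `QuantizationEncoding`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StoredTensor:
    """Hypothetical record of what the file says about one tensor."""

    dtype: str
    shape: Tuple[int, ...]
    quantization_encoding: Optional[str] = None


def validate_request(stored, dtype=None, shape=None, quantization_encoding=None):
    """Raise ValueError if an explicitly requested property disagrees with the file."""
    requested = {
        "dtype": dtype,
        "shape": None if shape is None else tuple(shape),
        "quantization_encoding": quantization_encoding,
    }
    for field, wanted in requested.items():
        if wanted is None:
            continue  # not requested, so there is nothing to validate
        actual = getattr(stored, field)
        if wanted != actual:
            raise ValueError(
                f"{field} mismatch: requested {wanted!r}, file has {actual!r}"
            )
    return stored


stored = StoredTensor(dtype="uint8", shape=(4096, 2304), quantization_encoding="Q4_K")
validate_request(stored, quantization_encoding="Q4_K")  # matches, so no error
```

Under this reading, a call that passes no arguments accepts whatever the file contains, while a mismatched `quantization_encoding` fails early instead of producing a graph with the wrong encoding.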
allocated_weights
property allocated_weights: dict[str, DLPackArray]
Gets the values of all weights that were allocated previously.
data()
data()
Loads and returns the weight data for this tensor.
-
Return type:
exists()
exists()
Returns True if a tensor exists for the current prefix.
-
Return type: bool
items()
items()
Iterates through all allocable weights that start with the prefix.
name
property name: str
The current weight name or prefix.
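The members above (`exists()`, `items()`, `name`) all act on a prefix that is built up through attribute and index access, as in `weights.model.layers[0].attention.wq`. The toy class below sketches how such prefix-style lookup could work over a flat mapping of dotted tensor names. It is a stand-in for illustration, not the MAX implementation: `PrefixView` is a hypothetical name, and the dot-joined naming and the choice to yield full names from `items()` are assumptions.

```python
class PrefixView:
    """Toy stand-in for prefix-style weight lookup (not the MAX implementation)."""

    def __init__(self, tensors, prefix=""):
        self._tensors = tensors  # flat mapping: dotted name -> value
        self._prefix = prefix

    def _child(self, part):
        # Extend the prefix by one dotted component.
        joined = f"{self._prefix}.{part}" if self._prefix else str(part)
        return PrefixView(self._tensors, joined)

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        if attr.startswith("_"):
            raise AttributeError(attr)
        return self._child(attr)

    def __getitem__(self, index):
        return self._child(index)

    @property
    def name(self):
        return self._prefix

    def exists(self):
        # True only when the prefix names a tensor exactly.
        return self._prefix in self._tensors

    def items(self):
        lead = f"{self._prefix}." if self._prefix else ""
        for full_name in self._tensors:
            if full_name.startswith(lead):
                yield full_name, PrefixView(self._tensors, full_name)


tensors = {
    "model.layers.0.attention.wq": "wq-bytes",
    "model.layers.0.feed_forward.w1": "w1-bytes",
    "model.layers.1.attention.wq": "wq1-bytes",
}
weights = PrefixView(tensors)
wq = weights.model.layers[0].attention.wq
layer0_names = [name for name, _ in weights.model.layers[0].items()]
print(wq.name, wq.exists(), layer0_names)
```

In this sketch an intermediate prefix such as `weights.model.layers[0]` reports `exists()` as False because no tensor has exactly that name, which is why the documented example checks `weight.exists()` inside the `items()` loop.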