Python module
hf_utils
Utilities for interacting with HuggingFace Files/Repos.
HuggingFaceFile
class max.pipelines.lib.hf_utils.HuggingFaceFile(repo_id, filename, revision=None)
A simple object for tracking Hugging Face model metadata. The repo_id will frequently be used to load a tokenizer, whereas the filename is used to download model weights.
download()
download(force_download=False)
Download the file and return the file path where the data is saved locally.
exists()
exists()
-
Return type:
filename
filename*: str*
repo_id
repo_id*: str*
revision
size()
size()
Return type:
int | None
HuggingFaceRepo
class max.pipelines.lib.hf_utils.HuggingFaceRepo(repo_id, revision='main', trust_remote_code=False, repo_type=None)
A class for interacting with HuggingFace Repos.
download()
download(filename, force_download=False)
encoding_for_file()
encoding_for_file(file)
file_exists()
file_exists(filename)
files_for_encoding()
files_for_encoding(encoding, weights_format=None)
formats_available
property formats_available*: list[WeightsFormat]*
info
property info*: ModelInfo*
repo_id
repo_id*: str*
The HuggingFace repo id. Despite the name, repo_id can be either a remote HuggingFace repo id or a local path.
repo_type
repo_type*: RepoType | None* = None
The type of repo. This is inferred from the repo_id.
revision
revision*: str* = 'main'
The revision to use for the repo.
size_of()
size_of(filename)
supported_encodings
property supported_encodings*: list[SupportedEncoding]*
trust_remote_code
trust_remote_code*: bool* = False
Whether to trust remote code.
weight_files
download_weight_files()
max.pipelines.lib.hf_utils.download_weight_files(huggingface_model_id, filenames, revision=None, force_download=False, max_workers=8)
Given a HuggingFace model id and a list of filenames, download the weight files and return the list of local paths.
Parameters:
- huggingface_model_id (str) – The HuggingFace model identifier, e.g. modularai/Llama-3.1-8B-Instruct-GGUF.
- filenames (list[str]) – A list of file paths relative to the root of the HuggingFace repo. If the files are available locally, the download is skipped and the local files are used.
- revision (str | None) – The HuggingFace revision to use. If provided, the local cache is checked directly, which saves a network call to HuggingFace.
- force_download (bool) – Whether to force the files to be redownloaded, even if they are already available in the local cache or at a provided path.
- max_workers (int) – The number of worker threads used to download files concurrently.
-
Return type:
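The max_workers parameter suggests a thread-pool download pattern. The following is a minimal sketch of that pattern using only the standard library; it is not the implementation of download_weight_files. The names fetch_all and fake_fetch are hypothetical, and fake_fetch only builds a path instead of downloading anything.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def fetch_all(
    filenames: list[str],
    fetch_one: Callable[[str], Path],
    max_workers: int = 8,
) -> list[Path]:
    """Fetch every file on a thread pool and return local paths in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of `filenames`,
        # regardless of which download finishes first.
        return list(pool.map(fetch_one, filenames))


def fake_fetch(filename: str) -> Path:
    # Hypothetical stand-in for a real download; it only builds a path.
    return Path("/tmp/weights") / filename


paths = fetch_all(["a.safetensors", "b.safetensors"], fake_fetch, max_workers=2)
```

Because the results come back in input order, the caller can pair each returned path with the filename it requested.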
generate_local_model_path()
max.pipelines.lib.hf_utils.generate_local_model_path(repo_id, revision)
Generate the local filesystem path where a HuggingFace model repo is cached.
This function takes a HuggingFace repository ID and revision hash and returns the full local filesystem path where the model files are cached by the huggingface_hub library. The path follows the standard HuggingFace caching convention: ~/.cache/huggingface/hub/models--{org}--{model}/snapshots/{revision}
-
Parameters:
-
Returns:
-
The absolute path to the cached model files for the specified revision. For example: “~/.cache/huggingface/hub/models--HuggingFaceTB--SmolLM2-135M/snapshots/abc123”
-
Return type:
-
Raises:
-
FileNotFoundError – If the model path does not exist locally
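The caching convention described above can be illustrated with a short standard-library sketch. This is not the source of generate_local_model_path. The function name sketch_cache_path and its cache_root parameter are hypothetical, and the real cache root may differ from ~/.cache/huggingface/hub depending on the environment.

```python
import tempfile
from pathlib import Path


def sketch_cache_path(repo_id: str, revision: str, cache_root: Path) -> Path:
    """Build {cache_root}/models--{org}--{model}/snapshots/{revision}."""
    # "org/model" becomes the folder name "models--org--model".
    folder = "models--" + repo_id.replace("/", "--")
    path = cache_root / folder / "snapshots" / revision
    if not path.exists():
        # Mirrors the documented behavior of raising when nothing is cached.
        raise FileNotFoundError(f"Model path does not exist: {path}")
    return path


# Build a throwaway cache layout so the lookup succeeds.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    snapshot = root / "models--HuggingFaceTB--SmolLM2-135M" / "snapshots" / "abc123"
    snapshot.mkdir(parents=True)
    found = sketch_cache_path("HuggingFaceTB/SmolLM2-135M", "abc123", root)
    relative = found.relative_to(root).as_posix()
```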
repo_exists_with_retry()
max.pipelines.lib.hf_utils.repo_exists_with_retry(repo_id, revision)
Wrapper around huggingface_hub.revision_exists with retry logic. Uses exponential backoff with 25% jitter, starting at 1s and doubling each retry.
We use revision_exists here instead of repo_exists because repo_exists does not take in a revision parameter.
See huggingface_hub.revision_exists for details.
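The retry schedule described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the names backoff_delays and call_with_retry are hypothetical, the 25% jitter is assumed to be symmetric (a random factor between 0.75 and 1.25), and the retry count and the set of retried exceptions are not given in this reference.

```python
import random
import time
from typing import Callable


def backoff_delays(retries: int, base: float = 1.0, jitter: float = 0.25) -> list[float]:
    """Return one delay per retry: base * 2**attempt, scaled by a random jitter factor."""
    return [
        # Assumption: jitter is symmetric, so the factor lies in [1 - jitter, 1 + jitter].
        base * (2 ** attempt) * random.uniform(1 - jitter, 1 + jitter)
        for attempt in range(retries)
    ]


def call_with_retry(
    check: Callable[[], bool],
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check(); on an exception, wait and try again, up to `retries` retries."""
    for delay in backoff_delays(retries):
        try:
            return check()
        except Exception:
            # Assumption: every exception is treated as transient here.
            sleep(delay)
    # Final attempt: let any exception propagate to the caller.
    return check()
```

Passing sleep as a parameter keeps the sketch testable without real waiting; a caller can substitute a function that only records the delays.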