Python class
PipelineRegistry
class max.pipelines.lib.registry.PipelineRegistry(architectures)
Bases: object
Registry for managing supported model architectures and their pipelines.
This class maintains a collection of SupportedArchitecture
instances, each defining how a particular model architecture should be
loaded, configured, and executed.
Use PIPELINE_REGISTRY when you want to:
- Register a custom architecture: Call register() to add a new MAX model architecture to the registry before loading it.
- Query supported models: Call retrieve_architecture() to check whether a Hugging Face model repository is supported before attempting to load it.
- Access cached configs: Methods like get_active_huggingface_config() and get_active_tokenizer() provide cached access to model configurations and tokenizers.
Parameters:
-
architectures (list[SupportedArchitecture])
get_active_huggingface_config()
get_active_huggingface_config(huggingface_repo)
Retrieves or creates a cached Hugging Face config for the given model.
Maintains a cache of Hugging Face configurations to avoid
reloading them unnecessarily, since each reload incurs a Hugging Face Hub API call.
If a config for the given model hasn’t been loaded before, it will
first try AutoConfig.from_pretrained() (for transformers models),
then fall back to loading the raw config.json and creating a
PretrainedConfig via from_dict() (for diffusers components
and other non-transformers models).
Note: The cache key is the HuggingFaceRepo itself, whose hash includes trust_remote_code and subfolder, so configs with different settings are cached separately. For multiprocessing, each worker process has its own registry instance with an empty cache, so configs are loaded fresh in each worker.
-
Parameters:
-
huggingface_repo (HuggingFaceRepo) – The HuggingFaceRepo containing the model.
-
Returns:
-
The Hugging Face configuration object for the model.
-
Raises:
-
FileNotFoundError – If no config.json can be found for the given repo/subfolder combination.
-
Return type:
-
PretrainedConfig
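The cache-then-fallback lookup can be sketched as follows. The two loader callables stand in for AutoConfig.from_pretrained() and the raw config.json path; they are placeholders for this sketch, not the real MAX code:

```python
from typing import Any, Callable

# Cache keyed by repo identifier; a hit avoids a Hugging Face Hub API call.
_config_cache: dict[str, Any] = {}


def get_active_config(
    repo: str,
    primary_loader: Callable[[str], Any],
    fallback_loader: Callable[[str], Any],
) -> Any:
    """Return a cached config, loading it on first use.

    primary_loader stands in for AutoConfig.from_pretrained() (transformers
    models); fallback_loader stands in for reading the raw config.json and
    building a PretrainedConfig via from_dict() (diffusers components and
    other non-transformers models).
    """
    if repo in _config_cache:
        return _config_cache[repo]
    try:
        config = primary_loader(repo)
    except Exception:
        config = fallback_loader(repo)
    _config_cache[repo] = config
    return config
```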
get_active_tokenizer()
get_active_tokenizer(huggingface_repo)
Retrieves or creates a cached Hugging Face AutoTokenizer for the given model.
Maintains a cache of Hugging Face tokenizers to avoid reloading them unnecessarily, since each reload incurs a Hugging Face Hub API call. If a tokenizer for the given model hasn’t been loaded before, it will create a new one using AutoTokenizer.from_pretrained() with the model’s settings.
-
Parameters:
-
huggingface_repo (HuggingFaceRepo) – The HuggingFaceRepo containing the model.
-
Returns:
-
The Hugging Face tokenizer for the model.
-
Return type:
-
PreTrainedTokenizer | PreTrainedTokenizerFast
register()
register(architecture, *, allow_override=False)
Adds a new architecture to the registry.
If multiple architectures share the same name but have different tasks, they are registered in a secondary lookup table keyed by (name, task).
-
Parameters:
-
- architecture (SupportedArchitecture)
- allow_override (bool)
-
Return type:
-
None
reset()
reset()
Clears all registered architectures (mainly for tests).
-
Return type:
-
None
retrieve()
retrieve(pipeline_config, task=PipelineTask.TEXT_GENERATION, override_architecture=None)
Retrieves the tokenizer and an instantiated pipeline for the config.
-
Parameters:
-
- pipeline_config (PipelineConfig)
- task (PipelineTask)
- override_architecture (str | None)
-
Return type:
-
tuple[PipelineTokenizer[Any, Any, Any], PipelineTypes]
retrieve_architecture()
retrieve_architecture(architecture_name, prefer_module_v3=False, task=None)
Retrieve a registered architecture by name.
-
Parameters:
-
- architecture_name (str | None) – The architecture class name to look up (e.g. "LlamaForCausalLM" or "FluxPipeline").
- prefer_module_v3 (bool) – Whether to use the eager API architecture variant. When False (default), uses the standard graph API architecture name. When True, appends the _ModuleV3 suffix to look up the eager API architecture.
- task (PipelineTask | None) – Optional task to disambiguate when multiple architectures share the same name.
-
Returns:
-
The matching SupportedArchitecture or None if no match found.
-
Return type:
-
SupportedArchitecture | None
retrieve_context_type()
retrieve_context_type(pipeline_config, override_architecture=None, task=None)
Retrieve the context class type associated with the architecture for the given pipeline configuration.
The context type defines how the pipeline manages request state and inputs during model execution. Different architectures may use different context implementations that adhere to either the TextGenerationContext or EmbeddingsContext protocol.
-
Parameters:
-
- pipeline_config (PipelineConfig) – The configuration for the pipeline.
- override_architecture (str | None) – Optional architecture name to use instead of looking up based on the model repository. This is useful for cases like audio generation where the pipeline uses a different architecture (e.g., audio decoder) than the underlying model repository.
- task (PipelineTask | None) – Optional pipeline task to disambiguate when multiple architectures share the same name but serve different tasks.
-
Returns:
-
The context class type associated with the architecture, which implements either the TextGenerationContext or EmbeddingsContext protocol.
-
Raises:
-
ValueError – If no supported architecture is found for the given model repository or override architecture name.
-
retrieve_factory()
retrieve_factory(pipeline_config, task=PipelineTask.TEXT_GENERATION, override_architecture=None)
Retrieves the tokenizer and a factory that creates the pipeline instance.
-
Parameters:
-
- pipeline_config (PipelineConfig)
- task (PipelineTask)
- override_architecture (str | None)
-
Return type:
-
tuple[PipelineTokenizer[Any, Any, Any], Callable[[], PipelineTypes]]
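The difference between retrieve() and retrieve_factory() is that the factory variant defers pipeline construction (weight loading, compilation) until the caller invokes the returned callable. A sketch of this deferral pattern, with all names below being hypothetical placeholders rather than real MAX types:

```python
from typing import Callable

# Records when pipeline construction actually happens.
construction_log: list[str] = []


class ToyPipeline:
    """Placeholder pipeline whose __init__ stands in for expensive setup."""

    def __init__(self, model: str) -> None:
        construction_log.append(model)
        self.model = model


def retrieve_factory(model: str) -> tuple[str, Callable[[], ToyPipeline]]:
    """Return a tokenizer plus a zero-argument factory for the pipeline."""
    tokenizer = f"tokenizer-for-{model}"  # placeholder for the real tokenizer
    return tokenizer, lambda: ToyPipeline(model)
```

Calling retrieve_factory() is cheap; nothing expensive runs until the factory itself is called.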
retrieve_pipeline_task()
retrieve_pipeline_task(architecture_name)
Retrieves the pipeline task for the given architecture name.
-
Parameters:
-
architecture_name (str | None) – The name of the architecture to look up.
-
Returns:
-
The task associated with the architecture.
-
Raises:
-
- ValueError – If the architecture supports multiple pipeline tasks, in which case the user must specify --task explicitly.
- ValueError – If the architecture is not found in the registry.
-
retrieve_tokenizer()
retrieve_tokenizer(pipeline_config, override_architecture=None, task=None)
Retrieves a tokenizer for the given pipeline configuration.
-
Parameters:
-
- pipeline_config (PipelineConfig) – Configuration for the pipeline
- override_architecture (str | None) – Optional architecture override string
- task (PipelineTask | None) – Optional pipeline task to disambiguate when multiple architectures share the same name but serve different tasks.
-
Returns:
-
The configured tokenizer
-
Raises:
-
ValueError – If no architecture is found