Python class
PipelineRegistry
class max.pipelines.lib.registry.PipelineRegistry(architectures)
Bases: object
Registry for managing supported model architectures and their pipelines.
This class maintains a collection of SupportedArchitecture
instances, each defining how a particular model architecture should be
loaded, configured, and executed.
Use PIPELINE_REGISTRY when you want to:
- Register a custom architecture: Call register() to add a new MAX model architecture to the registry before loading it.
- Query supported models: Call retrieve_architecture() to check whether a Hugging Face model repository is supported before attempting to load it.
- Access cached configs: Methods like get_active_huggingface_config() and get_active_tokenizer() provide cached access to model configurations and tokenizers.
Parameters:
-
architectures (list[SupportedArchitecture])
get_active_huggingface_config()
get_active_huggingface_config(huggingface_repo)
Retrieves or creates a cached Hugging Face config for the given model.
Maintains a cache of Hugging Face configurations to avoid
reloading them unnecessarily, since each reload incurs a Hugging Face Hub API call.
If a config for the given model hasn’t been loaded before, it will
first try AutoConfig.from_pretrained() (for transformers models),
then fall back to loading the raw config.json and creating a
PretrainedConfig via from_dict() (for diffusers components
and other non-transformers models).
Note: The cache key is the HuggingFaceRepo itself, whose hash includes trust_remote_code and subfolder, so configs with different settings are cached separately. For multiprocessing, each worker process has its own registry instance with an empty cache, so configs are loaded fresh in each worker.
-
Parameters:
-
huggingface_repo (HuggingFaceRepo) – The HuggingFaceRepo containing the model.
-
Returns:
-
The Hugging Face configuration object for the model.
-
Raises:
-
FileNotFoundError – If no config.json can be found for the given repo/subfolder combination.
-
Return type:
-
PretrainedConfig
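The cache-then-fallback lookup can be sketched as follows. The two loader callables stand in for AutoConfig.from_pretrained() and the raw config.json path; they are placeholders for this sketch, not the real MAX code:

```python
from typing import Any, Callable

# Cache keyed by repo identifier; a hit avoids a Hugging Face Hub API call.
_config_cache: dict[str, Any] = {}


def get_active_config(
    repo: str,
    primary_loader: Callable[[str], Any],
    fallback_loader: Callable[[str], Any],
) -> Any:
    """Return a cached config, loading it on first use.

    primary_loader stands in for AutoConfig.from_pretrained() (transformers
    models); fallback_loader stands in for reading the raw config.json and
    building a PretrainedConfig via from_dict() (diffusers components and
    other non-transformers models).
    """
    if repo in _config_cache:
        return _config_cache[repo]
    try:
        config = primary_loader(repo)
    except Exception:
        config = fallback_loader(repo)
    _config_cache[repo] = config
    return config
```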
get_active_tokenizer()
get_active_tokenizer(huggingface_repo)
Retrieves or creates a cached Hugging Face AutoTokenizer for the given model.
Maintains a cache of Hugging Face tokenizers to avoid reloading them unnecessarily, since each reload incurs a Hugging Face Hub API call. If a tokenizer for the given model hasn’t been loaded before, it will create a new one using AutoTokenizer.from_pretrained() with the model’s settings.
-
Parameters:
-
huggingface_repo (HuggingFaceRepo) – The HuggingFaceRepo containing the model.
-
Returns:
-
The Hugging Face tokenizer for the model.
-
Return type:
-
PreTrainedTokenizer | PreTrainedTokenizerFast
register()
register(architecture, *, allow_override=False)
Adds a new architecture to the registry.
If multiple architectures share the same name but have different tasks, they are registered in a secondary lookup table keyed by (name, task).
-
Parameters:
-
- architecture (SupportedArchitecture)
- allow_override (bool)
-
Return type:
-
None
reset()
reset()
Clears all registered architectures (mainly for tests).
-
Return type:
-
None
retrieve()
retrieve(pipeline_config, task=PipelineTask.TEXT_GENERATION, override_architecture=None)
Retrieves the tokenizer and an instantiated pipeline for the config.
-
Parameters:
-
- pipeline_config (PipelineConfig)
- task (PipelineTask)
- override_architecture (str | None)
-
Return type:
-
tuple[PipelineTokenizer[Any, Any, Any], PipelineTypes]
retrieve_architecture()
retrieve_architecture(architecture_name, prefer_module_v3=False, task=None)
Retrieve a registered architecture by name.
-
Parameters:
-
- architecture_name (str | None) – The architecture class name to look up (e.g. "LlamaForCausalLM" or "FluxPipeline").
- prefer_module_v3 (bool) – Whether to use the eager API architecture variant. When False (default), uses the standard graph API architecture name. When True, appends the _ModuleV3 suffix to look up the eager API architecture.
- task (PipelineTask | None) – Optional task to disambiguate when multiple architectures share the same name.
-
Returns:
-
The matching SupportedArchitecture or None if no match found.
-
Return type:
-
SupportedArchitecture | None
retrieve_context_type()
retrieve_context_type(pipeline_config, override_architecture=None, task=None)
Retrieve the context class type associated with the architecture for the given pipeline configuration.
The context type defines how the pipeline manages request state and inputs during model execution. Different architectures may use different context implementations that adhere to either the TextGenerationContext or EmbeddingsContext protocol.
-
Parameters:
-
- pipeline_config (PipelineConfig) – The configuration for the pipeline.
- override_architecture (str | None) – Optional architecture name to use instead of looking up based on the model repository. This is useful for cases like audio generation where the pipeline uses a different architecture (e.g., audio decoder) than the underlying model repository.
- task (PipelineTask | None) – Optional pipeline task to disambiguate when multiple architectures share the same name but serve different tasks.
-
Returns:
-
The context class type associated with the architecture, which implements either the TextGenerationContext or EmbeddingsContext protocol.
-
Raises:
-
ValueError – If no supported architecture is found for the given model repository or override architecture name.
-
retrieve_factory()
retrieve_factory(pipeline_config, task=PipelineTask.TEXT_GENERATION, override_architecture=None)
Retrieves the tokenizer and a factory that creates the pipeline instance.
-
Parameters:
-
- pipeline_config (PipelineConfig)
- task (PipelineTask)
- override_architecture (str | None)
-
Return type:
-
tuple[PipelineTokenizer[Any, Any, Any], Callable[[], PipelineTypes]]
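The difference between retrieve() and retrieve_factory() is that the factory variant defers pipeline construction (weight loading, compilation) until the caller invokes the returned callable. A sketch of this deferral pattern, with all names below being hypothetical placeholders rather than real MAX types:

```python
from typing import Callable

# Records when pipeline construction actually happens.
construction_log: list[str] = []


class ToyPipeline:
    """Placeholder pipeline whose __init__ stands in for expensive setup."""

    def __init__(self, model: str) -> None:
        construction_log.append(model)
        self.model = model


def retrieve_factory(model: str) -> tuple[str, Callable[[], ToyPipeline]]:
    """Return a tokenizer plus a zero-argument factory for the pipeline."""
    tokenizer = f"tokenizer-for-{model}"  # placeholder for the real tokenizer
    return tokenizer, lambda: ToyPipeline(model)
```

Calling retrieve_factory() is cheap; nothing expensive runs until the factory itself is called.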
retrieve_pipeline_task()
retrieve_pipeline_task(architecture_name)
Retrieves the pipeline task for the given architecture name.
-
Parameters:
-
architecture_name (str | None) – The name of the architecture to look up.
-
Returns:
-
The task associated with the architecture.
-
Raises:
-
- ValueError – If the architecture supports multiple pipeline tasks, in which case the user must specify --task explicitly.
- ValueError – If the architecture is not found in the registry.
-
retrieve_tokenizer()
retrieve_tokenizer(pipeline_config, override_architecture=None, task=None)
Retrieves a tokenizer for the given pipeline configuration.
-
Parameters:
-
- pipeline_config (PipelineConfig) – Configuration for the pipeline
- override_architecture (str | None) – Optional architecture override string
- task (PipelineTask | None) – Optional pipeline task to disambiguate when multiple architectures share the same name but serve different tasks.
-
Returns:
-
The configured tokenizer
-
Raises:
-
ValueError – If no architecture is found