Python class

PipelineConfig

`PipelineConfig`

class max.pipelines.PipelineConfig(*, config_file=None, section_name=None, debug_verify_replay=False, models=<factory>, model_override=<factory>, sampling=<factory>, profiling=<factory>, lora=None, speculative=None, runtime=<factory>)

Bases: ConfigFileModel

Configuration for a pipeline.

Contains settings for model selection, batch sizing, sampling, profiling, LoRA adapters, and speculative decoding. Once initialized, all fields are resolved to their final values from CLI flags, config files, environment variables, or internal defaults.
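To make the idea of resolving one final value from several sources concrete, here is a minimal sketch using the standard library's `collections.ChainMap`. The precedence order shown (CLI flags, then environment variables, then config file, then defaults) is an assumption for illustration only; this page does not state the order `PipelineConfig` actually applies. `resolve_settings` and the flat setting names are hypothetical.

```python
from collections import ChainMap


def resolve_settings(cli, env, file_values, defaults):
    """Merge four sources into one dict; earlier sources win (assumed order)."""
    # ChainMap looks keys up left to right, so the first mapping that
    # defines a key supplies its value.
    return dict(ChainMap(cli, env, file_values, defaults))


resolved = resolve_settings(
    cli={"max_batch_size": 8},
    env={"max_seq_len": 2048},
    file_values={"max_batch_size": 4, "max_seq_len": 1024},
    defaults={"max_batch_size": 1, "max_seq_len": 512, "debug_verify_replay": False},
)
print(resolved)
```

In this sketch, a value given on the command line shadows the same key in every lower-priority source, and a key that appears only in the defaults still ends up in the resolved result.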

Parameters:

config_file (str | None)
section_name (str | None)
debug_verify_replay (bool)
models (dict[str, MAXModelConfig])
model_override (list[str])
sampling (SamplingConfig)
profiling (ProfilingConfig)
lora (LoRAConfig | None)
speculative (SpeculativeConfig | None)
runtime (PipelineRuntimeConfig)

`configure_session()`

configure_session(session)

Configures an InferenceSession with standard pipeline settings.

Parameters: session (InferenceSession)
Return type: None

`debug_verify_replay`

debug_verify_replay: bool

Whether to run eager verification before device graph replay.

`draft_model`

property draft_model: MAXModelConfig | None

The draft model configuration. Alias for models.get("draft").

`graph_quantization_encoding`

property graph_quantization_encoding: QuantizationEncoding | None

Converts the CLI encoding to a MAX graph quantization encoding.

Returns: The graph quantization encoding corresponding to the CLI encoding.

`log_basic_config()`

log_basic_config()

Logs minimal pipeline configuration information.

Logs basic PipelineConfig options including model name, pipeline task, weight path, max_batch_size, max_seq_len, and reserved memory.

Return type: None

`log_pipeline_info()`

log_pipeline_info()

Logs comprehensive pipeline and KVCache configuration information.

Retrieves all necessary information from self and the PIPELINE_REGISTRY. Raises an error if the architecture is not found (which should not happen after config resolution).

Return type: None

`lora`

lora: LoRAConfig | None

The LoRA configuration.

`model`

property model: MAXModelConfig

The main model config. Alias for models["main"].
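The two aliases, `model` for `models["main"]` and `draft_model` for `models.get("draft")`, differ in how they treat a missing role. The sketch below illustrates that difference with hypothetical stand-in classes (`StubModelConfig`, `StubPipelineConfig`, and the `model_path` field are illustrative assumptions, not the real MAX classes).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StubModelConfig:
    """Hypothetical stand-in for MAXModelConfig; not the real class."""
    model_path: str


@dataclass
class StubPipelineConfig:
    """Hypothetical stand-in that mirrors only the role-keyed lookup."""
    models: dict[str, StubModelConfig] = field(default_factory=dict)

    @property
    def model(self) -> StubModelConfig:
        # Alias for models["main"]; raises KeyError if no main model is set.
        return self.models["main"]

    @property
    def draft_model(self) -> StubModelConfig | None:
        # Alias for models.get("draft"); None when no draft model is set.
        return self.models.get("draft")


config = StubPipelineConfig(models={"main": StubModelConfig("org/main-model")})
print(config.model.model_path)
print(config.draft_model)
```

Because `draft_model` is typed as optional, callers should check it for `None` before use, whereas `model` assumes that a `"main"` entry is always present.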

`model_config`

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'strict': False}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

`model_override`

model_override: list[str]

Per-component model overrides applied before resolution.

`model_post_init()`

model_post_init(context, /)

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:

self (BaseModel) – The BaseModel instance.
context (Any) – The context.

Return type: None

`models`

models: _ModelsType

The model manifest containing all model configs keyed by role.

`profiling`

profiling: ProfilingConfig

The profiling configuration.

`resolve()`

resolve()

Validates and resolves the config.

Called after the config is initialized to ensure all config fields are in a valid state.

Return type: None
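As a rough illustration of the validate-and-resolve pattern described above, the sketch below fills unset fields with fallback values and then rejects invalid ones. It is not the actual `PipelineConfig.resolve()` logic: `StubRuntimeConfig`, its field placement, and the fallback values are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StubRuntimeConfig:
    """Hypothetical stand-in; field names and defaults are illustrative only."""
    max_batch_size: int | None = None
    max_seq_len: int | None = None

    def resolve(self) -> None:
        # Fill unset fields with fallback defaults (assumed values).
        if self.max_batch_size is None:
            self.max_batch_size = 1
        if self.max_seq_len is None:
            self.max_seq_len = 512
        # Reject values that cannot describe a valid state.
        if self.max_batch_size < 1:
            raise ValueError("max_batch_size must be at least 1")
        if self.max_seq_len < 1:
            raise ValueError("max_seq_len must be at least 1")


runtime = StubRuntimeConfig(max_batch_size=4)
runtime.resolve()
print(runtime)
```

Keeping resolution in one explicit step means that code running afterwards can treat every field as final, without rechecking for unset values.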

`runtime`

runtime: PipelineRuntimeConfig

The model-agnostic runtime settings for pipeline execution.

`sampling`

sampling: SamplingConfig

The sampling configuration.

`speculative`

speculative: SpeculativeConfig | None

The speculative decoding configuration.
