Python class
PipelineConfig
class max.pipelines.PipelineConfig(*, config_file=None, section_name=None, debug_verify_replay=False, models=<factory>, model_override=<factory>, sampling=<factory>, profiling=<factory>, lora=None, speculative=None, runtime=<factory>)
Bases: ConfigFileModel
Configuration for a pipeline.
Contains settings for model selection, batch sizing, sampling, profiling, LoRA adapters, and speculative decoding. Once initialized, all fields are resolved to their final values from CLI flags, config files, environment variables, or internal defaults.
Parameters:
- config_file (str | None)
- section_name (str | None)
- debug_verify_replay (bool)
- models (dict[str, MAXModelConfig])
- model_override (list[str])
- sampling (SamplingConfig)
- profiling (ProfilingConfig)
- lora (LoRAConfig | None)
- speculative (SpeculativeConfig | None)
- runtime (PipelineRuntimeConfig)
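The class description says fields are resolved from CLI flags, config files, environment variables, or internal defaults. The following is a minimal standard-library sketch of how such layered resolution can work. The precedence order (CLI, then environment, then config file, then defaults), the `resolve_setting_layers` helper, and the example values are assumptions made for illustration; they are not part of the MAX API, and this page does not state the order PipelineConfig actually uses.

```python
from collections import ChainMap

# Hypothetical internal defaults; the values are illustrative only.
DEFAULTS = {"max_batch_size": 1, "max_seq_len": 2048}


def resolve_setting_layers(cli, env, config_file, defaults=DEFAULTS):
    """Merge setting sources; earlier arguments win (assumed precedence)."""
    # ChainMap returns the value from the first mapping that contains the key.
    return dict(ChainMap(cli, env, config_file, defaults))


resolved = resolve_setting_layers(
    cli={"max_batch_size": 8},
    env={"max_seq_len": 4096},
    config_file={"max_batch_size": 4, "max_seq_len": 1024},
)
print(resolved)
```

In this sketch the CLI value for max_batch_size and the environment value for max_seq_len override the config file, and the defaults are used only for keys no other source provides.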
configure_session()
configure_session(session)
Configures an InferenceSession with standard pipeline settings.
Parameters:
- session (InferenceSession)
Return type:
None
debug_verify_replay
debug_verify_replay: bool
Whether to run eager verification before device graph replay.
draft_model
property draft_model: MAXModelConfig | None
The draft model configuration. Alias for models.get("draft").
graph_quantization_encoding
property graph_quantization_encoding: QuantizationEncoding | None
Converts the CLI encoding to a MAX graph quantization encoding.
Returns:
The graph quantization encoding corresponding to the CLI encoding.
log_basic_config()
log_basic_config()
Logs minimal pipeline configuration information.
Logs basic PipelineConfig options including model name, pipeline task, weight path, max_batch_size, max_seq_len, and reserved memory.
Return type:
None
log_pipeline_info()
log_pipeline_info()
Logs comprehensive pipeline and KVCache configuration information.
Retrieves all necessary information from self and the PIPELINE_REGISTRY. Raises an error if the architecture is not found, which should not happen after config resolution.
Return type:
None
lora
lora: LoRAConfig | None
The LoRA configuration.
model
property model: MAXModelConfig
The main model config. Alias for models["main"].
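The two aliases, model for models["main"] and draft_model for models.get("draft"), differ in how they treat a missing role. The sketch below illustrates that difference with stand-in classes. `ModelConfigStub`, `PipelineConfigStub`, and the `model_path` field are hypothetical names used only for this example, not the MAX classes themselves.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModelConfigStub:
    """Hypothetical stand-in for MAXModelConfig."""
    model_path: str


@dataclass
class PipelineConfigStub:
    """Hypothetical stand-in that keeps only the role-keyed manifest."""
    models: dict[str, ModelConfigStub] = field(default_factory=dict)

    @property
    def model(self) -> ModelConfigStub:
        # Mirrors models["main"]: raises KeyError when no main model is set.
        return self.models["main"]

    @property
    def draft_model(self) -> ModelConfigStub | None:
        # Mirrors models.get("draft"): returns None when no draft model is set.
        return self.models.get("draft")


cfg = PipelineConfigStub(models={"main": ModelConfigStub("org/main-model")})
print(cfg.model.model_path)
print(cfg.draft_model)
```

Under these assumptions, indexing makes the main model mandatory, while `.get` lets the draft model be optional, which matches the `MAXModelConfig | None` type documented for draft_model.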
model_config
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'strict': False}
Pydantic configuration for this class. It should be a dictionary conforming to pydantic.config.ConfigDict.
model_override
model_override: list[str]
Per-component model overrides applied before resolution.
model_post_init()
model_post_init(context, /)
This function is meant to behave like a BaseModel method to initialise private attributes.
It takes context as an argument since that’s what pydantic-core passes when calling it.
Parameters:
- self (BaseModel) – The BaseModel instance.
- context (Any) – The context.
Return type:
None
models
models: _ModelsType
The model manifest containing all model configs keyed by role.
profiling
profiling: ProfilingConfig
The profiling configuration.
resolve()
resolve()
Validates and resolves the config.
Called after the config is initialized to ensure all config fields are in a valid state.
Return type:
None
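The initialize-then-resolve pattern described for resolve() can be sketched as follows. This is only an illustration of the general technique: `RuntimeSettingsStub`, its default values, and its validation rules are assumptions, and the checks PipelineConfig.resolve() really performs are not documented on this page.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RuntimeSettingsStub:
    """Hypothetical settings object; fields and defaults are illustrative."""
    max_batch_size: int | None = None
    max_seq_len: int | None = None

    def resolve(self) -> None:
        # Step 1: fill any unset field with an assumed default.
        if self.max_batch_size is None:
            self.max_batch_size = 1
        if self.max_seq_len is None:
            self.max_seq_len = 2048
        # Step 2: reject values that would leave the config invalid.
        if self.max_batch_size < 1:
            raise ValueError("max_batch_size must be at least 1")
        if self.max_seq_len < 1:
            raise ValueError("max_seq_len must be at least 1")


settings = RuntimeSettingsStub(max_batch_size=4)
settings.resolve()
print(settings)
```

Separating construction from resolution lets every source set its fields first, so validation runs once against the final values.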
runtime
runtime: PipelineRuntimeConfig
The model-agnostic runtime settings for pipeline execution.
sampling
sampling: SamplingConfig
The sampling configuration.
speculative
speculative: SpeculativeConfig | None
The speculative decoding configuration.