Python class
PipelineConfig
class max.pipelines.PipelineConfig(*, config_file=None, section_name=None, debug_verify_replay=False, models=<factory>, model_override=<factory>, sampling=<factory>, profiling=<factory>, lora=None, speculative=None, runtime=<factory>)
Bases: ConfigFileModel
Configuration for a pipeline.
Contains settings for model selection, batch sizing, sampling, profiling, LoRA adapters, and speculative decoding. Once initialized, all fields are resolved to their final values from CLI flags, config files, environment variables, or internal defaults.
Parameters:
- config_file (str | None)
- section_name (str | None)
- debug_verify_replay (bool)
- models (dict[str, MAXModelConfig])
- model_override (list[str])
- sampling (SamplingConfig)
- profiling (ProfilingConfig)
- lora (LoRAConfig | None)
- speculative (SpeculativeConfig | None)
- runtime (PipelineRuntimeConfig)
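The class description lists four sources for field values but does not say which one wins on conflict. The following is a minimal sketch of layered resolution, assuming the precedence is CLI flags, then config file, then environment variables, then internal defaults. That ordering is an assumption, not something this page confirms, and the keys and values are hypothetical placeholders rather than real PipelineConfig fields.

```python
from collections import ChainMap

# Hypothetical layers; keys and values are illustrative only.
defaults = {"max_batch_size": 1, "max_seq_len": 2048}
env_vars = {"max_seq_len": 4096}
config_file = {"max_batch_size": 8}
cli_flags = {"max_batch_size": 16}

# ChainMap searches its maps left to right, so earlier layers take
# priority. The order below encodes the ASSUMED precedence.
resolved = dict(ChainMap(cli_flags, config_file, env_vars, defaults))
print(resolved)
```

In this sketch the CLI value for max_batch_size overrides the config file value, and max_seq_len falls through to the environment layer because no higher layer sets it.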
configure_session()
configure_session(session)
Configures an InferenceSession with standard pipeline settings.
Parameters:
session (InferenceSession)
Return type:
None
debug_verify_replay
debug_verify_replay: bool
Whether to run eager verification before device graph replay.
draft_model
property draft_model: MAXModelConfig | None
The draft model configuration. Alias for models.get("draft").
graph_quantization_encoding
property graph_quantization_encoding: QuantizationEncoding | None
Converts the CLI encoding to a MAX graph quantization encoding.
Returns:
The graph quantization encoding corresponding to the CLI encoding.
log_basic_config()
log_basic_config()
Logs minimal pipeline configuration information.
Logs basic PipelineConfig options including model name, pipeline task, weight path, max_batch_size, max_seq_len, and reserved memory.
Return type:
None
log_pipeline_info()
log_pipeline_info()
Logs comprehensive pipeline and KVCache configuration information.
Retrieves all necessary information from self and the PIPELINE_REGISTRY. Raises an error if the architecture is not found (which should not happen after config resolution).
Return type:
None
lora
lora: LoRAConfig | None
The LoRA configuration.
model
property model: MAXModelConfig
The main model config. Alias for models["main"].
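The two aliases differ in how they treat a missing entry: indexing with models["main"] raises KeyError when the key is absent, while models.get("draft") returns None. The sketch below shows that difference using stand-in classes. StandInModelConfig, StandInPipelineConfig, and the name field are hypothetical and only mirror the documented alias expressions; they are not the real MAXModelConfig or PipelineConfig.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StandInModelConfig:
    """Hypothetical stand-in for MAXModelConfig; not the real class."""

    name: str


@dataclass
class StandInPipelineConfig:
    """Hypothetical stand-in that mirrors only the documented aliases."""

    models: dict[str, StandInModelConfig] = field(default_factory=dict)

    @property
    def model(self) -> StandInModelConfig:
        # Mirrors models["main"]: a missing "main" entry raises KeyError.
        return self.models["main"]

    @property
    def draft_model(self) -> StandInModelConfig | None:
        # Mirrors models.get("draft"): a missing "draft" entry gives None.
        return self.models.get("draft")


cfg = StandInPipelineConfig(models={"main": StandInModelConfig("main-model")})
print(cfg.model.name)
print(cfg.draft_model)
```

Code that reads draft_model should therefore handle None, which is consistent with its documented type of MAXModelConfig | None.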
model_config
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'strict': False}
Configuration for the model. Should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).
model_override
model_override: list[str]
Per-component model overrides applied before resolution.
model_post_init()
model_post_init(context, /)
This function is meant to behave like a BaseModel method to initialise private attributes.
It takes context as an argument since that's what pydantic-core passes when calling it.
Parameters:
- self (BaseModel): The BaseModel instance.
- context (Any): The context.
Return type:
None
models
models: _ModelsType
The model manifest containing all model configs keyed by role.
profiling
profiling: ProfilingConfig
The profiling configuration.
resolve()
resolve()
Validates and resolves the config.
Called after the config is initialized to ensure all config fields are in a valid state.
Return type:
None
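The actual checks performed by resolve() are not described on this page. As a rough sketch of the general construct-then-resolve shape only, the stand-in below fills an unset value with an assumed default and then validates it. StandInRuntime, its max_batch_size field, the default of 1, and the positivity check are all hypothetical and do not reflect the real implementation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StandInRuntime:
    """Hypothetical stand-in; not PipelineRuntimeConfig."""

    max_batch_size: int | None = None  # None means "not chosen yet"

    def resolve(self) -> None:
        # Step 1: fill an unset value with an ASSUMED default.
        if self.max_batch_size is None:
            self.max_batch_size = 1
        # Step 2: validate the final value.
        if self.max_batch_size < 1:
            raise ValueError("max_batch_size must be positive")


runtime = StandInRuntime()
runtime.resolve()  # returns None and mutates the instance in place
print(runtime.max_batch_size)
```

Like the documented method, this resolve() returns None and works by updating the instance, so callers read the resolved fields afterwards rather than using a return value.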
runtime
runtime: PipelineRuntimeConfig
The model-agnostic runtime settings for pipeline execution.
sampling
sampling: SamplingConfig
The sampling configuration.
speculative
speculative: SpeculativeConfig | None
The speculative decoding configuration.