DiffusionPipeline

`DiffusionPipeline`

class max.pipelines.lib.interfaces.DiffusionPipeline(pipeline_config, session, devices, weight_paths, cache_config=None, **kwargs)

Bases: ABC

Base class for diffusion pipelines.

Subclasses must define components, a mapping from component names to ComponentModel types.

Parameters:

pipeline_config (PipelineConfig)
session (InferenceSession)
devices (list[Device])
weight_paths (list[Path])
cache_config (DenoisingCacheConfig | None)
kwargs (Any)
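
The subclassing contract described above can be sketched without the real library. This is a minimal illustration, not the MAX implementation: ComponentModel and DiffusionPipelineSketch below are local stand-ins that mirror only the members documented on this page, and ToyTransformer, ToyVAE, ToyPipeline, and the dictionary-based inputs and outputs are hypothetical names invented for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional, Type


class ComponentModel:
    """Local stand-in for the real ComponentModel type (assumption)."""


class DiffusionPipelineSketch(ABC):
    """Local stand-in mirroring only the members documented on this page."""

    components: Optional[Dict[str, Type[ComponentModel]]] = None
    default_num_inference_steps: int = 50

    @abstractmethod
    def init_remaining_components(self) -> None: ...

    @abstractmethod
    def prepare_inputs(self, context: Any) -> Any: ...

    @abstractmethod
    def execute(self, model_inputs: Any, **kwargs: Any) -> Any: ...


class ToyTransformer(ComponentModel):
    """Hypothetical component used only for illustration."""


class ToyVAE(ComponentModel):
    """Hypothetical component used only for illustration."""


class ToyPipeline(DiffusionPipelineSketch):
    # Required: map component names to ComponentModel types.
    components = {"transformer": ToyTransformer, "vae": ToyVAE}
    # Optional: replace the class-level default of 50 steps.
    default_num_inference_steps = 28

    def init_remaining_components(self) -> None:
        # Placeholder for a non-ComponentModel helper such as an image processor.
        self.image_processor = object()

    def prepare_inputs(self, context: Any) -> Any:
        # The real method receives a PixelContext; a dict stands in here.
        return {"prompt": context["prompt"]}

    def execute(self, model_inputs: Any, **kwargs: Any) -> Any:
        # The real method returns a DiffusionPipelineOutput; a dict stands in here.
        steps = kwargs.get("num_inference_steps", self.default_num_inference_steps)
        return {"prompt": model_inputs["prompt"], "steps": steps}


pipeline = ToyPipeline()
pipeline.init_remaining_components()
result = pipeline.execute(pipeline.prepare_inputs({"prompt": "a red fox"}))
print(result)
```

Because the three methods are abstract, instantiating a subclass that omits any of them raises TypeError, which is the usual way an ABC enforces this kind of contract.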

`cache_config`

cache_config: DenoisingCacheConfig

`components`

components: dict[str, type[ComponentModel]] | None = None

`create_cache_state()`

create_cache_state(batch_size, seq_len, transformer_config, text_seq_len=0)

Create per-request cache state with fresh tensors.

Parameters:

batch_size (int) – Batch dimension (from prompt_embeds).
seq_len (int) – Sequence length (from latents).
transformer_config (Any) – Transformer config carrying dimension info. Must have num_attention_heads, attention_head_dim, patch_size, out_channels, and in_channels attributes.
text_seq_len (int) – Text sequence length. Reserved for cache modes that require text-aware allocations.

Return type:

DenoisingCacheState
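
Since transformer_config is typed as Any, a missing attribute would only surface when it is first read. A caller might check the five documented attribute names up front, as in the sketch below. The helper missing_config_attrs and the numeric values are hypothetical and are not part of the library.

```python
from types import SimpleNamespace
from typing import Any, List

# Attribute names listed in the create_cache_state() parameter description.
REQUIRED_ATTRS = (
    "num_attention_heads",
    "attention_head_dim",
    "patch_size",
    "out_channels",
    "in_channels",
)


def missing_config_attrs(transformer_config: Any) -> List[str]:
    """Return the documented attribute names that the config object lacks."""
    return [name for name in REQUIRED_ATTRS if not hasattr(transformer_config, name)]


# Illustrative values only; real values come from the model's configuration.
complete = SimpleNamespace(
    num_attention_heads=24,
    attention_head_dim=128,
    patch_size=2,
    out_channels=16,
    in_channels=16,
)
partial = SimpleNamespace(num_attention_heads=24, attention_head_dim=128)

print(missing_config_attrs(complete))
print(missing_config_attrs(partial))
```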

`default_num_inference_steps`

default_num_inference_steps: int = 50

Default number of denoising steps when the user does not specify one.

Subclasses may override this to provide a model-appropriate default.

`default_residual_threshold`

default_residual_threshold: float = 0.05

Model-specific default for the FBCache relative difference threshold.

Subclasses may override this to provide a model-appropriate default. Used when the request does not specify a residual_threshold.
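
This page does not define how the relative difference is computed. A common formulation in first-block caching compares the first transformer block's residual at the current step with the one from the previous step, and reuses the cached result when the change is small. The sketch below assumes that formulation (mean absolute change divided by mean absolute previous value) and uses plain Python lists in place of tensors. Both function names are hypothetical.

```python
from typing import Sequence


def relative_difference(prev: Sequence[float], curr: Sequence[float]) -> float:
    """Mean absolute change, relative to the mean absolute previous value."""
    if len(prev) != len(curr) or not prev:
        raise ValueError("prev and curr must be non-empty and the same length")
    change = sum(abs(c - p) for p, c in zip(prev, curr)) / len(prev)
    scale = sum(abs(p) for p in prev) / len(prev)
    # With an all-zero previous residual there is no meaningful ratio.
    return change / scale if scale > 0 else float("inf")


def can_reuse_cache(
    prev: Sequence[float],
    curr: Sequence[float],
    residual_threshold: float = 0.05,  # matches default_residual_threshold
) -> bool:
    """True when the residual changed little enough to skip recomputation."""
    return relative_difference(prev, curr) < residual_threshold


previous = [1.0, -2.0, 3.0]
small_change = [1.01, -2.02, 3.03]  # about 1% change
large_change = [1.5, -2.0, 3.0]     # about 8% change

print(can_reuse_cache(previous, small_change))
print(can_reuse_cache(previous, large_change))
```

Under this reading, a lower threshold recomputes more often and a higher threshold reuses the cache more aggressively.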

`default_taylorseer_cache_interval`

default_taylorseer_cache_interval: int = 5

Model-specific default for the TaylorSeer cache interval.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_cache_interval is None.

`default_taylorseer_max_order`

default_taylorseer_max_order: int = 1

Model-specific default for the TaylorSeer expansion order.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_max_order is None.

`default_taylorseer_warmup_steps`

default_taylorseer_warmup_steps: int = 9

Model-specific default for the TaylorSeer warmup steps.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_warmup_steps is None.
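
The fallback rule shared by the three TaylorSeer defaults (use the pipeline's class attribute when the config field is None) can be sketched as follows. CacheConfigSketch and PipelineDefaults are local stand-ins, not the real DenoisingCacheConfig or DiffusionPipeline. The is_full_compute_step helper shows only one plausible reading of how warmup steps and the cache interval could interact; the actual schedule is not specified on this page.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheConfigSketch:
    """Stand-in holding only the three fields named on this page."""

    taylorseer_cache_interval: Optional[int] = None
    taylorseer_max_order: Optional[int] = None
    taylorseer_warmup_steps: Optional[int] = None


class PipelineDefaults:
    """Stand-in carrying the documented class-level defaults."""

    default_taylorseer_cache_interval = 5
    default_taylorseer_max_order = 1
    default_taylorseer_warmup_steps = 9


def resolve_taylorseer(config: CacheConfigSketch, pipeline: type) -> Dict[str, int]:
    """Prefer explicit config values; fall back to pipeline defaults on None."""

    def pick(value: Optional[int], fallback: int) -> int:
        return fallback if value is None else value

    return {
        "cache_interval": pick(
            config.taylorseer_cache_interval,
            pipeline.default_taylorseer_cache_interval,
        ),
        "max_order": pick(
            config.taylorseer_max_order, pipeline.default_taylorseer_max_order
        ),
        "warmup_steps": pick(
            config.taylorseer_warmup_steps,
            pipeline.default_taylorseer_warmup_steps,
        ),
    }


def is_full_compute_step(step: int, warmup_steps: int, cache_interval: int) -> bool:
    """Assumed schedule: full compute during warmup, then every Nth step."""
    if step < warmup_steps:
        return True
    return (step - warmup_steps) % cache_interval == 0


settings = resolve_taylorseer(
    CacheConfigSketch(taylorseer_cache_interval=3), PipelineDefaults
)
full_steps = [
    s
    for s in range(20)
    if is_full_compute_step(s, settings["warmup_steps"], settings["cache_interval"])
]
print(settings)
print(full_steps)
```

Comparing against None rather than testing truthiness matters here: an explicit value of 0 in the config should be honored, not replaced by the default.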

`execute()`

abstract execute(model_inputs, **kwargs)

Execute the pipeline with the given model inputs.

Parameters:

model_inputs (Any) – Prepared model inputs from prepare_inputs.
**kwargs (Any) – Additional pipeline-specific execution parameters.

Returns:

A DiffusionPipelineOutput containing NHWC uint8 images.

Return type:

DiffusionPipelineOutput

`init_remaining_components()`

abstract init_remaining_components()

Initialize non-ComponentModel components (e.g., image processors).

Return type:

None

`prepare_inputs()`

abstract prepare_inputs(context)

Prepare inputs for the pipeline.

Parameters:

context (PixelContext)

Return type:

Any

`run_denoising_step()`

run_denoising_step(step, cache_state, device, **kwargs)

Execute one denoising step with caching logic.

Delegates the actual transformer call to self.run_transformer(), which subclasses override with model-specific arguments.

Parameters:

step (int) – Current step index.
cache_state (DenoisingCacheState) – Per-request mutable cache state for this stream.
device (Device) – Target device.
**kwargs (Any) – Model-specific arguments forwarded to run_transformer.

Returns:

The noise_pred tensor for this step.

Return type:

Tensor

`run_transformer()`

run_transformer(cache_state, **kwargs)

Run the transformer for one denoising step.

Subclasses must override this to call their transformer with the appropriate model-specific arguments. The method should return (noise_pred,) when first_block_caching is disabled, or (new_residual, noise_pred) when first_block_caching is enabled.

Parameters:

cache_state (DenoisingCacheState) – Per-request mutable cache state for this stream.
**kwargs (Any) – Model-specific arguments forwarded from run_denoising_step.

Return type:

tuple[Tensor, ...]
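
Because the tuple length depends on whether first_block_caching is enabled, code that consumes run_transformer() output has to unpack it accordingly. The helper below is a hypothetical sketch of that unpacking, with strings standing in for tensors. It is not the library's run_denoising_step() logic.

```python
from typing import Any, Optional, Tuple


def split_transformer_outputs(
    outputs: Tuple[Any, ...], first_block_caching: bool
) -> Tuple[Optional[Any], Any]:
    """Return (new_residual, noise_pred); new_residual is None without caching."""
    expected = 2 if first_block_caching else 1
    if len(outputs) != expected:
        raise ValueError(
            f"expected {expected} output(s) with "
            f"first_block_caching={first_block_caching}, got {len(outputs)}"
        )
    if first_block_caching:
        new_residual, noise_pred = outputs
        return new_residual, noise_pred
    (noise_pred,) = outputs
    return None, noise_pred


# Strings stand in for tensors in this illustration.
print(split_transformer_outputs(("noise",), first_block_caching=False))
print(split_transformer_outputs(("residual", "noise"), first_block_caching=True))
```

Checking the length explicitly turns a mismatched override into a clear error instead of a silent mix-up between the residual and the noise prediction.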

`unprefixed_weight_component`

unprefixed_weight_component: str | None = None

When set, weight files without a <component>/ prefix are assigned to this component. This supports multi-repo layouts where quantized weights for one component (e.g. the transformer) are shipped as flat files in a separate repo while the remaining components use the base model repo.
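
The assignment rule can be illustrated with a small standalone function. This is a sketch under stated assumptions: paths are taken to be relative to a repo root so that the first path segment is the component prefix, and assign_weight_paths and the file names are hypothetical. How the library actually resolves paths is not described on this page.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Optional


def assign_weight_paths(
    weight_paths: Iterable[str],
    component_names: Iterable[str],
    unprefixed_weight_component: Optional[str] = None,
) -> Dict[str, List[PurePosixPath]]:
    """Group weight files by their leading <component>/ directory."""
    assigned: Dict[str, List[PurePosixPath]] = {name: [] for name in component_names}
    if (
        unprefixed_weight_component is not None
        and unprefixed_weight_component not in assigned
    ):
        raise ValueError(f"unknown component: {unprefixed_weight_component!r}")

    for raw in weight_paths:
        path = PurePosixPath(raw)
        # A prefix exists only when the file sits under a directory.
        prefix = path.parts[0] if len(path.parts) > 1 else None
        if prefix in assigned:
            assigned[prefix].append(path)
        elif unprefixed_weight_component is not None:
            # No recognized <component>/ prefix: use the designated component.
            assigned[unprefixed_weight_component].append(path)
        else:
            raise ValueError(f"cannot assign {raw!r} to any component")
    return assigned


layout = assign_weight_paths(
    ["vae/model.safetensors", "transformer-q4.safetensors"],
    component_names=["vae", "transformer"],
    unprefixed_weight_component="transformer",
)
print(layout)
```

With unprefixed_weight_component left as None, the flat file in this example would raise an error rather than being silently dropped, which is one reasonable choice for a sketch like this.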
