
Python class

DiffusionPipeline

class max.pipelines.diffusion.DiffusionPipeline(pipeline_config, session, devices, weight_paths, cache_config=None, **kwargs)

Bases: ABC

Base class for diffusion pipelines.

Subclasses must define components, a mapping from component names to ComponentModel types.
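As a rough illustration of that contract, the sketch below declares a subclass with a components mapping and an overridden default_num_inference_steps. It does not import MAX: ComponentModel and DiffusionPipeline here are simplified stand-ins that mirror only the members listed on this page, the constructor arguments are omitted, and TextEncoder, Transformer, Vae, and MyPipeline are hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional, Type


class ComponentModel:
    """Stand-in for MAX's ComponentModel (not the real class)."""


class DiffusionPipeline(ABC):
    """Stand-in base class; the real constructor takes pipeline_config, session, etc."""

    components: Optional[Dict[str, Type[ComponentModel]]] = None
    default_num_inference_steps: int = 50

    @abstractmethod
    def init_remaining_components(self) -> None: ...

    @abstractmethod
    def prepare_inputs(self, context: Any) -> Any: ...

    @abstractmethod
    def execute(self, model_inputs: Any, **kwargs: Any) -> Any: ...


# Hypothetical component types, used only for illustration.
class TextEncoder(ComponentModel): ...
class Transformer(ComponentModel): ...
class Vae(ComponentModel): ...


class MyPipeline(DiffusionPipeline):
    # Map each component name to the ComponentModel type that implements it.
    components = {
        "text_encoder": TextEncoder,
        "transformer": Transformer,
        "vae": Vae,
    }
    # Override the class-level default with a model-appropriate value.
    default_num_inference_steps = 28

    def init_remaining_components(self) -> None:
        # Non-ComponentModel helpers (e.g. an image processor) would be built here.
        self.image_processor = object()

    def prepare_inputs(self, context: Any) -> Any:
        return {"context": context}

    def execute(self, model_inputs: Any, **kwargs: Any) -> Any:
        # A real pipeline would return a DiffusionPipelineOutput.
        return model_inputs
```

Because the three abstract methods are declared with abstractmethod, instantiating the base class directly raises TypeError, while MyPipeline can be constructed.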

Parameters:

components

components: dict[str, type[ComponentModel]] | None = None

create_cache_state()

create_cache_state(batch_size, seq_len, transformer_config, text_seq_len=0)

Create per-request cache state with fresh tensors.

Parameters:

  • batch_size (int) – Batch dimension (from prompt_embeds).
  • seq_len (int) – Sequence length (from latents).
  • transformer_config (Any) – Transformer config carrying dimension info. Must have num_attention_heads, attention_head_dim, patch_size, out_channels, and in_channels attributes.
  • text_seq_len (int) – Text sequence length. Reserved for cache modes that require text-aware allocations.

Return type:

DenoisingCacheState
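Since transformer_config is typed as Any, a missing attribute would only surface when the cache state is built. The following is a minimal sketch of a pre-flight check for the five attributes listed above; missing_config_attrs is a hypothetical helper, not part of MAX, and the config values are made up for illustration.

```python
from types import SimpleNamespace

# Attribute names taken from the transformer_config description above.
REQUIRED_ATTRS = (
    "num_attention_heads",
    "attention_head_dim",
    "patch_size",
    "out_channels",
    "in_channels",
)


def missing_config_attrs(config):
    """Return the required attribute names that `config` does not define."""
    return [name for name in REQUIRED_ATTRS if not hasattr(config, name)]


# Hypothetical config values, for illustration only.
complete = SimpleNamespace(
    num_attention_heads=24,
    attention_head_dim=128,
    patch_size=2,
    out_channels=16,
    in_channels=16,
)
incomplete = SimpleNamespace(
    num_attention_heads=24,
    attention_head_dim=128,
    patch_size=2,
    out_channels=16,
)

print(missing_config_attrs(complete))
print(missing_config_attrs(incomplete))
```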

default_num_inference_steps

default_num_inference_steps: int = 50

Default number of denoising steps when the user does not specify one.

Subclasses may override this to provide a model-appropriate default.

default_residual_threshold

default_residual_threshold: float = 0.05

Model-specific default for the FBCache relative difference threshold.

Subclasses may override this to provide a model-appropriate default. Used when the request does not specify a residual_threshold.
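To give a sense of what a relative difference threshold controls, here is a sketch of one common first-block-cache test: the mean absolute change in the first block's residual, divided by the mean absolute value of the previous residual, compared against the threshold. This formulation is an assumption and this page does not state how MAX computes the metric. relative_difference and can_reuse_cache are hypothetical names, and plain lists stand in for tensors.

```python
def relative_difference(prev_residual, new_residual):
    """Mean absolute change, relative to the mean magnitude of the previous residual."""
    count = len(prev_residual)
    mean_change = sum(abs(n - p) for p, n in zip(prev_residual, new_residual)) / count
    mean_prev = sum(abs(p) for p in prev_residual) / count
    if mean_prev == 0:
        return float("inf")  # No reference magnitude, so never treat as "unchanged".
    return mean_change / mean_prev


def can_reuse_cache(prev_residual, new_residual, residual_threshold=0.05):
    """True when the residual changed by less than the threshold (assumed semantics)."""
    return relative_difference(prev_residual, new_residual) < residual_threshold


prev = [1.0, 2.0, 3.0, 4.0]
small_change = [1.01, 2.01, 3.01, 4.01]  # relative difference of about 0.004
large_change = [1.5, 2.5, 3.5, 4.5]      # relative difference of about 0.2

print(can_reuse_cache(prev, small_change))
print(can_reuse_cache(prev, large_change))
```

Under this reading, a larger threshold reuses cached work more often at the cost of accuracy, which is why a per-model default can be useful.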

default_taylorseer_cache_interval

default_taylorseer_cache_interval: int = 5

Model-specific default for the TaylorSeer cache interval.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_cache_interval is None.

default_taylorseer_max_order

default_taylorseer_max_order: int = 1

Model-specific default for the TaylorSeer expansion order.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_max_order is None.

default_taylorseer_warmup_steps

default_taylorseer_warmup_steps: int = 9

Model-specific default for the TaylorSeer warmup steps.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_warmup_steps is None.
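The three TaylorSeer defaults above can be read together as a schedule plus a predictor. The sketch below assumes one plausible interpretation that this page does not confirm: every step during warmup is computed in full, a full compute then recurs every cache_interval steps, and the steps in between are predicted with a Taylor expansion truncated at max_order. is_full_compute_step and taylor_predict are hypothetical helpers, and scalars stand in for feature tensors.

```python
from math import factorial


def is_full_compute_step(step, warmup_steps=9, cache_interval=5):
    """Assumed schedule: full compute during warmup, then once per interval."""
    if step < warmup_steps:
        return True
    return (step - warmup_steps) % cache_interval == 0


def taylor_predict(derivatives, steps_ahead, max_order=1):
    """Extrapolate a cached feature `steps_ahead` steps forward.

    derivatives[k] is the k-th derivative estimate at the last full compute;
    terms above `max_order` are ignored.
    """
    terms = derivatives[: max_order + 1]
    return sum(d * steps_ahead**k / factorial(k) for k, d in enumerate(terms))


# Steps that would run the full transformer in a 20-step schedule.
full_steps = [s for s in range(20) if is_full_compute_step(s)]
print(full_steps)

# First-order prediction: value 2.0, slope 0.5, three steps ahead.
print(taylor_predict([2.0, 0.5], steps_ahead=3))
```

With the documented defaults, this reading yields full computes at steps 0 through 9, then 14 and 19, with the remaining steps extrapolated linearly.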

execute()

abstract execute(model_inputs, **kwargs)

Execute the pipeline with the given model inputs.

Parameters:

  • model_inputs (Any) – Prepared model inputs from prepare_inputs.
  • **kwargs (Any) – Additional pipeline-specific execution parameters.

Returns:

A DiffusionPipelineOutput containing NHWC uint8 images.

Return type:

DiffusionPipelineOutput

init_remaining_components()

abstract init_remaining_components()

Initialize non-ComponentModel components (e.g., image processors).

Return type:

None

prepare_inputs()

abstract prepare_inputs(context)

Prepare inputs for the pipeline.

Parameters:

context (PixelGenerationContext)

Return type:

Any

run_denoising_step()

run_denoising_step(step, cache_state, device, **kwargs)

Execute one denoising step with caching logic.

Delegates the actual transformer call to self.run_transformer(), which subclasses override with model-specific arguments.

Parameters:

  • step (int) – Current step index.
  • cache_state (DenoisingCacheState) – Per-request mutable cache state for this stream.
  • device (Device) – Target device.
  • **kwargs (Any) – Model-specific arguments forwarded to run_transformer.

Returns:

The noise_pred tensor for this step.

Return type:

Tensor
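The delegation described above might look roughly like the following. This is a stand-in, not MAX code: the caching logic is left out entirely, a dict stands in for DenoisingCacheState, lists stand in for tensors, and SketchPipeline, ToyPipeline, latents, and timestep are hypothetical names. The only behavior taken from this page is that **kwargs are forwarded to run_transformer and that noise_pred is the last element of its return tuple.

```python
class SketchPipeline:
    def run_transformer(self, cache_state, **kwargs):
        raise NotImplementedError("Subclasses supply the model-specific call.")

    def run_denoising_step(self, step, cache_state, device, **kwargs):
        # The real method also decides whether cached results can be reused;
        # this sketch always runs the transformer.
        outputs = self.run_transformer(cache_state, **kwargs)
        cache_state["steps_run"] = cache_state.get("steps_run", 0) + 1
        # noise_pred is last in both (noise_pred,) and (new_residual, noise_pred).
        return outputs[-1]


class ToyPipeline(SketchPipeline):
    def run_transformer(self, cache_state, *, latents, timestep):
        # Toy "model": scale the latents by the timestep.
        noise_pred = [x * timestep for x in latents]
        return (noise_pred,)


pipe = ToyPipeline()
state = {}  # One mutable cache state per request.
latents = [1.0, 2.0]
for step, t in enumerate([1.0, 0.5]):
    noise_pred = pipe.run_denoising_step(
        step, state, device="cpu", latents=latents, timestep=t
    )

print(noise_pred, state)
```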

run_transformer()

run_transformer(cache_state, **kwargs)

Run the transformer for one denoising step.

Subclasses must override this to call their transformer with the appropriate model-specific arguments. The method should return (noise_pred,) when first_block_caching is disabled, or (new_residual, noise_pred) when first_block_caching is enabled.

Parameters:

  • cache_state (DenoisingCacheState) – Per-request mutable cache state for this stream.
  • **kwargs (Any) – Model-specific arguments forwarded from run_denoising_step.

Return type:

tuple[Tensor, …]
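Because the tuple length depends on whether first_block_caching is enabled, a caller has to unpack it accordingly. A minimal sketch of that unpacking follows; unpack_transformer_output is a hypothetical helper, and strings stand in for tensors.

```python
def unpack_transformer_output(outputs, first_block_caching):
    """Split a run_transformer result into (new_residual, noise_pred).

    new_residual is None when first_block_caching is disabled.
    """
    if first_block_caching:
        new_residual, noise_pred = outputs  # Expects exactly two elements.
        return new_residual, noise_pred
    (noise_pred,) = outputs  # Expects exactly one element.
    return None, noise_pred


print(unpack_transformer_output(("noise",), first_block_caching=False))
print(unpack_transformer_output(("residual", "noise"), first_block_caching=True))
```

Tuple unpacking raises ValueError if the length does not match the flag, which surfaces a mismatched override early instead of silently passing a residual where noise_pred was expected.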

unprefixed_weight_component

unprefixed_weight_component: str | None = None

When set, weight files without a <component>/ prefix are assigned to this component. This supports multi-repo layouts where quantized weights for one component (e.g. the transformer) are shipped as flat files in a separate repo while the remaining components use the base model repo.
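The assignment rule can be sketched as a grouping over weight paths. This is an illustration of the described behavior, not MAX's loader: group_weights_by_component is a hypothetical function, the file names are invented, and raising an error when a flat file is found while the attribute is unset is an assumption (this page does not say what happens in that case).

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_weights_by_component(weight_paths, component_names, unprefixed_weight_component=None):
    """Group weight files by their leading <component>/ directory."""
    groups = defaultdict(list)
    for path in weight_paths:
        parts = PurePosixPath(path).parts
        if len(parts) > 1 and parts[0] in component_names:
            groups[parts[0]].append(path)
        elif unprefixed_weight_component is not None:
            # Flat file from a separate repo: assign it to the designated component.
            groups[unprefixed_weight_component].append(path)
        else:
            # Assumed behavior; the real loader may handle this differently.
            raise ValueError(f"No component prefix for weight file: {path}")
    return dict(groups)


# Hypothetical layout: base-repo files are prefixed, quantized transformer files are flat.
paths = [
    "text_encoder/model.safetensors",
    "vae/model.safetensors",
    "model-00001-of-00002.safetensors",
    "model-00002-of-00002.safetensors",
]
grouped = group_weights_by_component(
    paths,
    component_names={"text_encoder", "transformer", "vae"},
    unprefixed_weight_component="transformer",
)
print(grouped)
```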