Python class

DiffusionPipeline

class max.pipelines.lib.interfaces.DiffusionPipeline(pipeline_config, session, devices, weight_paths, cache_config=None, **kwargs)

Bases: ABC

Base class for diffusion pipelines.

Subclasses must define components mapping component names to ComponentModel types.

Parameters:

components

components: dict[str, type[ComponentModel]] | None = None
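A minimal sketch of the subclass contract, using hypothetical component classes. The real `ComponentModel` lives in `max.pipelines`; a stub stands in for it here, and the component names and classes are illustrative, not taken from any real model.

```python
class ComponentModel:
    """Stand-in for max.pipelines' ComponentModel."""


class TextEncoderModel(ComponentModel):
    """Hypothetical text-encoder component."""


class TransformerModel(ComponentModel):
    """Hypothetical denoising-transformer component."""


class MyDiffusionPipeline:
    # In real code this class would subclass DiffusionPipeline; the point
    # here is only the shape of the `components` mapping it must define.
    components: dict[str, type[ComponentModel]] = {
        "text_encoder": TextEncoderModel,
        "transformer": TransformerModel,
    }
```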

create_cache_state()

create_cache_state(batch_size, seq_len, transformer_config, text_seq_len=0)

Create per-request cache state with fresh tensors.

Parameters:

  • batch_size (int) – Batch dimension (from prompt_embeds).
  • seq_len (int) – Sequence length (from latents).
  • transformer_config (Any) – Transformer config carrying dimension info. Must have num_attention_heads, attention_head_dim, patch_size, out_channels, and in_channels attributes.
  • text_seq_len (int) – Text sequence length. Reserved for cache modes that require text-aware allocations.

Return type:

DenoisingCacheState
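Any object exposing the five required attributes satisfies the `transformer_config` contract. A sketch using `types.SimpleNamespace` with illustrative (not model-accurate) values; the commented-out call shows how it would be passed, assuming a constructed `pipeline` instance.

```python
from types import SimpleNamespace

# Carries the dimension info create_cache_state reads; the values here are
# placeholders, not taken from any real transformer config.
transformer_config = SimpleNamespace(
    num_attention_heads=24,
    attention_head_dim=128,
    patch_size=1,
    out_channels=64,
    in_channels=64,
)

# Hypothetical usage against a real pipeline instance:
# cache_state = pipeline.create_cache_state(
#     batch_size=1,
#     seq_len=4096,
#     transformer_config=transformer_config,
#     text_seq_len=512,
# )
```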

default_num_inference_steps

default_num_inference_steps: int = 50

Default number of denoising steps when the user does not specify one.

Subclasses may override this to provide a model-appropriate default.

default_residual_threshold

default_residual_threshold: float = 0.05

Model-specific default for the FBCache relative difference threshold.

Subclasses may override this to provide a model-appropriate default. Used when the request does not specify a residual_threshold.

default_taylorseer_cache_interval

default_taylorseer_cache_interval: int = 5

Model-specific default for the TaylorSeer cache interval.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_cache_interval is None.

default_taylorseer_max_order

default_taylorseer_max_order: int = 1

Model-specific default for the TaylorSeer expansion order.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_max_order is None.

default_taylorseer_warmup_steps

default_taylorseer_warmup_steps: int = 9

Model-specific default for the TaylorSeer warmup steps.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.taylorseer_warmup_steps is None.

default_teacache_coefficients

default_teacache_coefficients: tuple[float, ...] = (498.651651, -283.781631, 55.8554382, -3.82021401, 0.264230861)

Default TeaCache polynomial coefficients for FLUX-style rescaling.
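The five coefficients form a degree-4 polynomial used to rescale the relative-L1 distance. Assuming the tuple is ordered highest power first (as with `numpy.poly1d` — an assumption, not stated by this page), evaluation is a straightforward Horner sum:

```python
COEFFS = (498.651651, -283.781631, 55.8554382, -3.82021401, 0.264230861)


def rescale(rel_l1: float, coeffs: tuple[float, ...] = COEFFS) -> float:
    """Evaluate the rescaling polynomial at rel_l1 via Horner's method,
    assuming coefficients are ordered from highest power to constant term."""
    acc = 0.0
    for c in coeffs:
        acc = acc * rel_l1 + c
    return acc
```

Under that ordering, `rescale(0.0)` is the constant term (0.264230861), and the rescaled value, rather than the raw relative-L1 distance, is what gets compared against the threshold.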

default_teacache_rel_l1_thresh

default_teacache_rel_l1_thresh: float = 0.4

Model-specific default for the TeaCache relative-L1 threshold.

Subclasses may override this to provide a model-appropriate default. Used when DenoisingCacheConfig.teacache_rel_l1_thresh is None.

execute()

abstract execute(model_inputs, **kwargs)

Execute the pipeline with the given model inputs.

Parameters:

  • model_inputs (Any) – Prepared model inputs from prepare_inputs.
  • **kwargs (Any) – Additional pipeline-specific execution parameters.

Returns:

A DiffusionPipelineOutput containing NHWC uint8 images.

Return type:

DiffusionPipelineOutput

init_remaining_components()

abstract init_remaining_components()

Initialize non-ComponentModel components (e.g., image processors).

Return type:

None

prepare_inputs()

abstract prepare_inputs(context)

Prepare inputs for the pipeline.

Parameters:

context (PixelGenerationContext)

Return type:

Any

run_denoising_step()

run_denoising_step(step, cache_state, device, **kwargs)

Execute one denoising step with caching logic.

Delegates the actual transformer call to self.run_transformer(), which subclasses override with model-specific arguments.

Parameters:

  • step (int) – Current step index.
  • cache_state (DenoisingCacheState) – Per-request mutable cache state for this stream.
  • device (Device) – Target device.
  • **kwargs (Any) – Model-specific arguments forwarded to run_transformer.

Returns:

noise_pred tensor for this step.

Return type:

Tensor
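A hypothetical driver loop illustrating the calling convention: one call per step, with the same per-request `cache_state` threaded through every call. Stubs replace the real pipeline, device, and tensor types.

```python
class StubPipeline:
    """Stand-in for a DiffusionPipeline subclass; records which steps ran."""

    def run_denoising_step(self, step, cache_state, device, **kwargs):
        cache_state["steps_run"].append(step)
        return f"noise_pred@{step}"  # a real pipeline returns a Tensor


pipeline = StubPipeline()
cache_state = {"steps_run": []}  # stands in for DenoisingCacheState

# One call per denoising step, sharing the same mutable cache state.
preds = [
    pipeline.run_denoising_step(step, cache_state, device="gpu:0")
    for step in range(4)
]
```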

run_transformer()

run_transformer(cache_state, **kwargs)

Run the transformer for one denoising step.

Subclasses must override this to call their transformer with the appropriate model-specific arguments. The method should return (noise_pred,) when first_block_caching is disabled, or (new_residual, noise_pred) when first_block_caching is enabled.

Parameters:

  • cache_state (DenoisingCacheState) – Per-request mutable cache state for this stream.
  • **kwargs (Any) – Model-specific arguments forwarded from run_denoising_step.

Return type:

tuple[Tensor, ...]
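An override sketch showing the documented return contract. The stub types are hypothetical, and reading `first_block_caching` off the cache state is an assumption for illustration; a real subclass would call its transformer model here.

```python
class FakeCacheState:
    """Stand-in for DenoisingCacheState."""
    first_block_caching = True


class MyPipeline:
    # In real code this would subclass DiffusionPipeline and invoke the
    # transformer; strings stand in for the Tensor results here.
    def run_transformer(self, cache_state, **kwargs):
        noise_pred = "noise_pred"
        new_residual = "new_residual"
        if cache_state.first_block_caching:
            # Caching enabled: return (new_residual, noise_pred).
            return (new_residual, noise_pred)
        # Caching disabled: return a 1-tuple of just the prediction.
        return (noise_pred,)
```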

unprefixed_weight_component

unprefixed_weight_component: str | None = None

When set, weight files without a <component>/ prefix are assigned to this component. This supports multi-repo layouts where quantized weights for one component (e.g. the transformer) are shipped as flat files in a separate repo while the remaining components use the base model repo.
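The assignment rule described above can be sketched as a small resolver. This is a hedged reimplementation for illustration, not the actual MAX logic: prefixed files map to the named component, and flat files fall back to `unprefixed_weight_component` when it is set.

```python
def resolve_component(path, known_components, unprefixed_weight_component=None):
    """Map a weight-file path to a component name.

    Files named "<component>/..." go to that component; files without a
    recognized prefix fall back to unprefixed_weight_component (or None).
    Hypothetical helper, not part of the MAX API.
    """
    head, sep, _ = path.partition("/")
    if sep and head in known_components:
        return head
    return unprefixed_weight_component
```

For example, with a quantized transformer shipped as flat files in a separate repo, `resolve_component("model-q4.safetensors", comps, "transformer")` would route the flat file to the transformer while `"text_encoder/model.safetensors"` still resolves by prefix.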