Python module

max.pipelines.architectures.flux2_modulev3

FLUX.2 diffusion architecture for image generation.

Flux2ArchConfig

class max.pipelines.architectures.flux2_modulev3.Flux2ArchConfig(*, max_seq_len=512)

Bases: ArchConfig

Pipeline-level config for Flux2 (implements ArchConfig; no KV cache).

Parameters:

max_seq_len (int)

get_max_seq_len()

get_max_seq_len()

Returns the maximum sequence length for the tokenizer.

Return type:

int
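
A minimal usage sketch, assuming get_max_seq_len() simply returns the max_seq_len field documented below (the field and its 512 default come from the class signature above):

from max.pipelines.architectures.flux2_modulev3 import Flux2ArchConfig

# Default from the signature above: max_seq_len=512.
config = Flux2ArchConfig()
assert config.get_max_seq_len() == 512

# max_seq_len is keyword-only, so override it by name.
long_config = Flux2ArchConfig(max_seq_len=1024)
assert long_config.get_max_seq_len() == 1024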

initialize()

classmethod initialize(pipeline_config, model_config=None)

Initialize the config from a PipelineConfig.

Parameters:

  • pipeline_config (PipelineConfig) – The pipeline configuration.
  • model_config (MAXModelConfig | None) – The model configuration to read from. When None (the default), pipeline_config.model is used. Pass an explicit config (e.g. pipeline_config.draft_model) to initialize the arch config for a different model.

Return type:

Self
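
A hedged sketch of the two call patterns described above; pipeline_config is assumed to be an already constructed PipelineConfig:

# Default: the arch config is read from pipeline_config.model.
arch_config = Flux2ArchConfig.initialize(pipeline_config)

# Explicit model config, e.g. to initialize the arch config for a draft model.
draft_arch_config = Flux2ArchConfig.initialize(
    pipeline_config,
    model_config=pipeline_config.draft_model,
)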

max_seq_len

max_seq_len: int = 512

Flux2Config

class max.pipelines.architectures.flux2_modulev3.Flux2Config(*, config_file=None, section_name=None, patch_size=1, in_channels=128, out_channels=None, num_layers=8, num_single_layers=48, attention_head_dim=128, num_attention_heads=48, joint_attention_dim=15360, timestep_guidance_channels=256, mlp_ratio=3.0, axes_dims_rope=(32, 32, 32, 32), rope_theta=2000, eps=1e-06, guidance_embeds=True, dtype=bfloat16, device=<factory>, quant_config=None)

Bases: MAXModelConfigBase

Parameters:

  • config_file (str | None)
  • section_name (str | None)
  • patch_size (int)
  • in_channels (int)
  • out_channels (int | None)
  • num_layers (int)
  • num_single_layers (int)
  • attention_head_dim (int)
  • num_attention_heads (int)
  • joint_attention_dim (int)
  • timestep_guidance_channels (int)
  • mlp_ratio (float)
  • axes_dims_rope (tuple[int, ...])
  • rope_theta (int)
  • eps (float)
  • guidance_embeds (bool)
  • dtype (DType)
  • device (DeviceRef)
  • quant_config (QuantConfig | None)
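
As a rough construction sketch (not an official recipe): every shape field defaults to the standard FLUX.2 transformer values in the signature above, so typically only device (and optionally dtype) needs to be supplied. The import paths and DeviceRef.GPU() are assumptions about the surrounding MAX API:

from max.dtype import DType
from max.graph import DeviceRef
from max.pipelines.architectures.flux2_modulev3 import Flux2Config

# Assumption: target the first GPU; dtype already defaults to bfloat16.
config = Flux2Config(device=DeviceRef.GPU(), dtype=DType.bfloat16)

# Klein/distilled variant: no guidance embedder weights are expected
# (see the guidance_embeds field below).
klein_config = Flux2Config(device=DeviceRef.GPU(), guidance_embeds=False)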

attention_head_dim

attention_head_dim: int

axes_dims_rope

axes_dims_rope: tuple[int, ...]

device

device: DeviceRef

dtype

dtype: DType

eps

eps: float

guidance_embeds

guidance_embeds: bool

If False (Klein/distilled), no guidance embedder weights are expected.

in_channels

in_channels: int

initialize_from_config()

classmethod initialize_from_config(config_dict, encoding, devices)

Parameters:

  • config_dict (dict[str, Any])
  • encoding (Literal['float32', 'bfloat16', 'q4_k', 'q4_0', 'q6_k', 'float8_e4m3fn', 'float4_e2m1fnx2', 'gptq'])
  • devices (list[Device])

Return type:

Self
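
A hedged usage sketch: config_dict is assumed to be the parsed JSON config shipped with the checkpoint, and max.driver.Accelerator is one plausible source of the Device list:

import json

from max.driver import Accelerator
from max.pipelines.architectures.flux2_modulev3 import Flux2Config

# Hypothetical path; the real location depends on the checkpoint layout.
with open("transformer/config.json") as f:
    config_dict = json.load(f)

config = Flux2Config.initialize_from_config(
    config_dict,
    encoding="bfloat16",      # one of the literals listed above
    devices=[Accelerator()],  # assumption: a single GPU device
)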

joint_attention_dim

joint_attention_dim: int

mlp_ratio

mlp_ratio: float

model_config

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'strict': False}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.
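
Because 'extra' is 'forbid', unknown keyword arguments are rejected at construction time. A quick illustration; the extra field name here is deliberately fake:

import pydantic

from max.graph import DeviceRef
from max.pipelines.architectures.flux2_modulev3 import Flux2Config

try:
    # not_a_real_field does not exist; 'extra': 'forbid' rejects it.
    Flux2Config(device=DeviceRef.CPU(), not_a_real_field=1)
except pydantic.ValidationError as err:
    print(err)  # reports the unexpected field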

num_attention_heads

num_attention_heads: int

num_layers

num_layers: int

num_single_layers

num_single_layers: int

out_channels

out_channels: int | None

patch_size

patch_size: int

quant_config

quant_config: QuantConfig | None

NVFP4 quantization config, populated when encoding is float4_e2m1fnx2.
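
A sketch of the relationship stated above, assuming initialize_from_config populates quant_config for the NVFP4 encoding and leaves it None otherwise:

# config_dict and devices as in the initialize_from_config example above.
nvfp4_config = Flux2Config.initialize_from_config(
    config_dict, encoding="float4_e2m1fnx2", devices=devices
)
assert nvfp4_config.quant_config is not None

bf16_config = Flux2Config.initialize_from_config(
    config_dict, encoding="bfloat16", devices=devices
)
assert bf16_config.quant_config is None  # assumption: None for non-NVFP4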

rope_theta

rope_theta: int

timestep_guidance_channels

timestep_guidance_channels: int

Flux2TransformerModel

class max.pipelines.architectures.flux2_modulev3.Flux2TransformerModel(config, encoding, devices, weights, *, cache_config=None)

Bases: ComponentModel

load_model()

load_model()

Load the runtime model instance.

Return type:

None
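
A hedged end-to-end sketch. The constructor arguments follow the signature above; weights is assumed to be a checkpoint weights object loaded elsewhere, and load_model() is assumed to populate the model attribute documented below (its documented return type is None):

# Assumes config (a Flux2Config), devices (a list of max.driver Devices),
# and weights (loaded checkpoint weights) already exist.
transformer = Flux2TransformerModel(
    config,
    encoding="bfloat16",
    devices=devices,
    weights=weights,
    cache_config=None,  # keyword-only; defaults to None
)

transformer.load_model()  # returns None
# Assumption: the compiled callable is then available as transformer.model,
# typed Callable[..., Any] per the attribute below.
run_transformer = transformer.model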

model

model: Callable[..., Any]