Python module

max.pipelines.architectures.flux2_modulev3

FLUX.2 diffusion architecture for image generation.

Flux2ArchConfig

class max.pipelines.architectures.flux2_modulev3.Flux2ArchConfig(*, max_seq_len=512)

Bases: ArchConfig

Pipeline-level config for Flux2 (implements ArchConfig; no KV cache).

Parameters:

max_seq_len (int)

get_max_seq_len()

get_max_seq_len()

Returns the maximum sequence length for the tokenizer.

Return type:

int
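
A minimal usage sketch, assuming get_max_seq_len() simply returns the max_seq_len field documented below (the field and its 512 default come from the class signature above):

from max.pipelines.architectures.flux2_modulev3 import Flux2ArchConfig

# Default from the signature above: max_seq_len=512.
config = Flux2ArchConfig()
assert config.get_max_seq_len() == 512

# max_seq_len is keyword-only, so override it by name.
long_config = Flux2ArchConfig(max_seq_len=1024)
assert long_config.get_max_seq_len() == 1024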

initialize()

classmethod initialize(pipeline_config, model_config=None)

Initialize the config from a PipelineConfig.

Parameters:

  • pipeline_config (PipelineConfig) – The pipeline configuration.
  • model_config (MAXModelConfig | None) – The model configuration to read from. When None (the default), pipeline_config.model is used. Pass an explicit config (e.g. pipeline_config.draft_model) to initialize the arch config for a different model.

Return type:

Self
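
A hedged sketch of the two call patterns described above; pipeline_config is assumed to be an already constructed PipelineConfig:

# Default: the arch config is read from pipeline_config.model.
arch_config = Flux2ArchConfig.initialize(pipeline_config)

# Explicit model config, e.g. to initialize the arch config for a draft model.
draft_arch_config = Flux2ArchConfig.initialize(
    pipeline_config,
    model_config=pipeline_config.draft_model,
)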

max_seq_len

max_seq_len: int = 512

Flux2Config

class max.pipelines.architectures.flux2_modulev3.Flux2Config(*, config_file=None, section_name=None, patch_size=1, in_channels=128, out_channels=None, num_layers=8, num_single_layers=48, attention_head_dim=128, num_attention_heads=48, joint_attention_dim=15360, timestep_guidance_channels=256, mlp_ratio=3.0, axes_dims_rope=(32, 32, 32, 32), rope_theta=2000, eps=1e-06, guidance_embeds=True, dtype=bfloat16, device=<factory>, quant_config=None)

Bases: MAXModelConfigBase

Parameters:

  • config_file (str | None)
  • section_name (str | None)
  • patch_size (int)
  • in_channels (int)
  • out_channels (int | None)
  • num_layers (int)
  • num_single_layers (int)
  • attention_head_dim (int)
  • num_attention_heads (int)
  • joint_attention_dim (int)
  • timestep_guidance_channels (int)
  • mlp_ratio (float)
  • axes_dims_rope (tuple[int, ...])
  • rope_theta (int)
  • eps (float)
  • guidance_embeds (bool)
  • dtype (DType)
  • device (DeviceRef)
  • quant_config (QuantConfig | None)
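
As a rough construction sketch (not an official recipe): every shape field defaults to the standard FLUX.2 transformer values in the signature above, so typically only device (and optionally dtype) needs to be supplied. The import paths and DeviceRef.GPU() are assumptions about the surrounding MAX API:

from max.dtype import DType
from max.graph import DeviceRef
from max.pipelines.architectures.flux2_modulev3 import Flux2Config

# Assumption: target the first GPU; dtype already defaults to bfloat16.
config = Flux2Config(device=DeviceRef.GPU(), dtype=DType.bfloat16)

# Klein/distilled variant: no guidance embedder weights are expected
# (see the guidance_embeds field below).
klein_config = Flux2Config(device=DeviceRef.GPU(), guidance_embeds=False)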

attention_head_dim

attention_head_dim: int

axes_dims_rope

axes_dims_rope: tuple[int, ...]

device

device: DeviceRef

dtype

dtype: DType

eps

eps: float

guidance_embeds

guidance_embeds: bool

If False (Klein/distilled), no guidance embedder weights are expected.

in_channels

in_channels: int

initialize_from_config()

classmethod initialize_from_config(config_dict, encoding, devices)

Parameters:

  • config_dict (dict[str, Any])
  • encoding (Literal['float32', 'bfloat16', 'q4_k', 'q4_0', 'q6_k', 'float8_e4m3fn', 'float4_e2m1fnx2', 'gptq'])
  • devices (list[Device])

Return type:

Self
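
A hedged usage sketch: config_dict is assumed to be the parsed JSON config shipped with the checkpoint, and max.driver.Accelerator is one plausible source of the Device list:

import json

from max.driver import Accelerator
from max.pipelines.architectures.flux2_modulev3 import Flux2Config

# Hypothetical path; the real location depends on the checkpoint layout.
with open("transformer/config.json") as f:
    config_dict = json.load(f)

config = Flux2Config.initialize_from_config(
    config_dict,
    encoding="bfloat16",      # one of the literals listed above
    devices=[Accelerator()],  # assumption: a single GPU device
)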

joint_attention_dim

joint_attention_dim: int

mlp_ratio

mlp_ratio: float

model_config

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'strict': False}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.
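
Because 'extra' is 'forbid', unknown keyword arguments are rejected at construction time. A quick illustration; the extra field name here is deliberately fake:

import pydantic

from max.graph import DeviceRef
from max.pipelines.architectures.flux2_modulev3 import Flux2Config

try:
    # not_a_real_field does not exist; 'extra': 'forbid' rejects it.
    Flux2Config(device=DeviceRef.CPU(), not_a_real_field=1)
except pydantic.ValidationError as err:
    print(err)  # reports the unexpected field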

num_attention_heads

num_attention_heads: int

num_layers

num_layers: int

num_single_layers

num_single_layers: int

out_channels

out_channels: int | None

patch_size

patch_size: int

quant_config

quant_config: QuantConfig | None

NVFP4 quantization config, populated when encoding is float4_e2m1fnx2.
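
A sketch of the relationship stated above, assuming initialize_from_config populates quant_config for the NVFP4 encoding and leaves it None otherwise:

# config_dict and devices as in the initialize_from_config example above.
nvfp4_config = Flux2Config.initialize_from_config(
    config_dict, encoding="float4_e2m1fnx2", devices=devices
)
assert nvfp4_config.quant_config is not None

bf16_config = Flux2Config.initialize_from_config(
    config_dict, encoding="bfloat16", devices=devices
)
assert bf16_config.quant_config is None  # assumption: None for non-NVFP4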

rope_theta

rope_theta: int

timestep_guidance_channels

timestep_guidance_channels: int

Flux2TransformerModel

class max.pipelines.architectures.flux2_modulev3.Flux2TransformerModel(config, encoding, devices, weights, *, cache_config=None)

Bases: ComponentModel

load_model()

load_model()

Load the runtime model instance.

Return type:

None
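
A hedged end-to-end sketch. The constructor arguments follow the signature above; weights is assumed to be a checkpoint weights object loaded elsewhere, and load_model() is assumed to populate the model attribute documented below (its documented return type is None):

# Assumes config (a Flux2Config), devices (a list of max.driver Devices),
# and weights (loaded checkpoint weights) already exist.
transformer = Flux2TransformerModel(
    config,
    encoding="bfloat16",
    devices=devices,
    weights=weights,
    cache_config=None,  # keyword-only; defaults to None
)

transformer.load_model()  # returns None
# Assumption: the compiled callable is then available as transformer.model,
# typed Callable[..., Any] per the attribute below.
run_transformer = transformer.model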

model

model: Callable[..., Any]