Python module

max.pipelines.architectures.z_image_modulev3

Z-Image diffusion architecture for image generation.

ZImageArchConfig

class max.pipelines.architectures.z_image_modulev3.ZImageArchConfig(*, pipeline_config: 'PipelineConfig')

Bases: ArchConfig

Parameters:

pipeline_config (PipelineConfig)

get_max_seq_len()

get_max_seq_len()

Returns the default maximum sequence length for the model.

Subclasses should determine whether this value can be overridden via the --max-length flag (pipeline_config.model.max_length).

Return type:

int

initialize()

classmethod initialize(pipeline_config, model_config=None)

Initialize the config from a PipelineConfig.

Parameters:

  • pipeline_config (PipelineConfig) – The pipeline configuration.
  • model_config (MAXModelConfig | None) – The model configuration to read from. When None (the default), pipeline_config.model is used. Pass an explicit config (e.g. pipeline_config.draft_model) to initialize the arch config for a different model.

Return type:

Self

pipeline_config

pipeline_config: PipelineConfig

ZImagePipeline

class max.pipelines.architectures.z_image_modulev3.ZImagePipeline(pipeline_config, session, devices, weight_paths, cache_config=None, **kwargs)

Bases: DiffusionPipeline

Diffusion pipeline for Z-Image generation (Qwen3 + transformer + VAE).

build_decode_latents()

build_decode_latents()

Return type:

None

build_prepare_scheduler()

build_prepare_scheduler()

Return type:

None

build_preprocess_latents()

build_preprocess_latents()

Return type:

None

build_scheduler_step()

build_scheduler_step()

Return type:

None

components

components: dict[str, type[ComponentModel]] | None

  • 'text_encoder': max.pipelines.architectures.qwen3_modulev3.text_encoder.model.Qwen3TextEncoderZImageModel
  • 'transformer': max.pipelines.architectures.z_image_modulev3.model.ZImageTransformerModel
  • 'vae': max.pipelines.architectures.autoencoders_modulev3.autoencoder_kl.AutoencoderKLModel

create_cache_state()

create_cache_state(batch_size, seq_len, transformer_config, text_seq_len=0)

Allocate FBCache / Taylor tensors using Z-Image output layout.

Parameters:

  • batch_size (int)
  • seq_len (int)
  • transformer_config (Any)
  • text_seq_len (int)

Return type:

DenoisingCacheState

decode_latents()

decode_latents(latents, h_carrier, w_carrier, output_type='np')

Decode packed latents into image output.

Return type:

Tensor | ndarray

default_num_inference_steps

default_num_inference_steps: int = 50

Default number of denoising steps when the user does not specify one.

Subclasses may override this to provide a model-appropriate default.

default_residual_threshold

default_residual_threshold: float = 0.06

Model-specific default for the FBCache relative difference threshold.

Subclasses may override this to provide a model-appropriate default. Used when the request does not specify a residual_threshold.
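To make the threshold concrete, here is a minimal FBCache-style check. This is a sketch under the assumption that "relative difference" means the mean absolute change of the first-block residual divided by its previous magnitude; the function name is illustrative and not part of MAX:

```python
import numpy as np

def should_reuse_cache(new_residual: np.ndarray,
                       old_residual: np.ndarray,
                       threshold: float = 0.06) -> bool:
    """Illustrative FBCache-style check: reuse the cached output when the
    first-block residual has changed by less than `threshold` (relative L1)."""
    denom = np.abs(old_residual).mean()
    if denom == 0.0:
        return False  # no previous residual magnitude to compare against
    rel_diff = np.abs(new_residual - old_residual).mean() / denom
    return bool(rel_diff < threshold)
```

With the default of 0.06, a step whose first-block residual moved by less than 6% of its prior magnitude would skip the remaining transformer blocks and reuse the cached output.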

execute()

execute(model_inputs, output_type='np')

Run the Z-Image denoising loop and decode outputs.

Parameters:

  • model_inputs (ZImageModelInputs)
  • output_type (Literal['np', 'latent', 'pil'])

Return type:

ZImagePipelineOutput
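The denoising loop this method runs can be sketched generically. The following toy illustration assumes an Euler-style update per step and stands in for the compiled transformer with a plain callable; it is not the actual MAX implementation:

```python
import numpy as np

def denoising_loop(latents, timesteps, dts, transformer):
    """Toy denoising loop: one transformer call and one Euler update per
    step. `transformer` stands in for the compiled Z-Image transformer."""
    for t, dt in zip(timesteps, dts):
        noise_pred = transformer(latents, t)
        latents = latents + dt * noise_pred  # Euler step per timestep
    return latents
```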

init_remaining_components()

init_remaining_components()

Initialize derived attributes and compiled subgraphs.

Return type:

None

prepare_img2img_latents()

prepare_img2img_latents(noise_latents, image_tensor, sigmas)

Return type:

Tensor
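This method carries no docstring. A common flow-matching img2img initialization, shown here purely as a hedged illustration of what the (noise_latents, image_tensor, sigmas) signature suggests, blends the encoded image with noise at the first sigma:

```python
import numpy as np

def img2img_init(noise_latents, image_latents, sigma0: float):
    """Illustrative flow-matching img2img start point: linear interpolation
    between clean image latents and pure noise at the first sigma."""
    return sigma0 * noise_latents + (1.0 - sigma0) * image_latents
```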

prepare_inputs()

prepare_inputs(context)

Convert a PixelGenerationContext into model inputs with device tensors.

Parameters:

context (PixelContext)

Return type:

ZImageModelInputs

prepare_prompt_embeddings()

prepare_prompt_embeddings(tokens, num_images_per_prompt)

Encode prompt tokens into text embeddings.

Parameters:

  • tokens (Tensor)
  • num_images_per_prompt (int)

Return type:

Tensor
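The num_images_per_prompt handling plausibly amounts to duplicating each prompt's encoded embeddings along the batch axis so every generated image has its own copy; a self-contained sketch of that step (an assumption, not the MAX code):

```python
import numpy as np

def duplicate_embeddings(prompt_embeds: np.ndarray,
                         num_images_per_prompt: int) -> np.ndarray:
    """Illustrative duplication step: repeat each prompt's embeddings
    along the batch axis, one copy per generated image."""
    # (batch, seq, dim) -> (batch * num_images_per_prompt, seq, dim)
    return np.repeat(prompt_embeds, num_images_per_prompt, axis=0)
```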

prepare_scheduler()

static prepare_scheduler(sigmas)

Precompute denoising timesteps and step deltas.

Parameters:

sigmas (Tensor)

Return type:

tuple[Tensor, Tensor]
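A plausible shape for this precomputation, sketched with NumPy (the real method operates on MAX Tensors; pairing each sigma with the signed step to the next one is an assumption based on the Euler-style loop):

```python
import numpy as np

def prepare_scheduler_sketch(sigmas: np.ndarray):
    """Hypothetical timestep/delta precomputation: the timestep for step i
    is sigmas[i], and dt is the signed step to the next sigma (the schedule
    runs from high noise down toward 0)."""
    timesteps = sigmas[:-1]
    dts = sigmas[1:] - sigmas[:-1]  # negative for a decreasing schedule
    return timesteps, dts
```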

preprocess_latents()

preprocess_latents(latents, dtype)

Patchify and pack latents before denoising.

Return type:

Tensor
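Patchify-and-pack typically folds small spatial patches into the feature dimension and flattens space into a token sequence. This NumPy sketch shows the idea; the patch size and channel layout are assumptions, not the Z-Image values:

```python
import numpy as np

def patchify(latents: np.ndarray, p: int = 2) -> np.ndarray:
    """Illustrative patchify/pack: fold p x p spatial patches into the
    feature dim, (C, H, W) -> (H*W / p^2, C * p^2)."""
    c, h, w = latents.shape
    x = latents.reshape(c, h // p, p, w // p, p)
    x = x.transpose(1, 3, 0, 2, 4)          # (H/p, W/p, C, p, p)
    return x.reshape((h // p) * (w // p), c * p * p)
```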

run_transformer()

run_transformer(cache_state, **kwargs)

Run the transformer for one denoising step.

Subclasses must override this to call their transformer with the appropriate model-specific arguments. The method should return (noise_pred,) when first_block_caching is disabled, or (new_residual, noise_pred) when first_block_caching is enabled.

Parameters:

  • cache_state (DenoisingCacheState) – Per-request mutable cache state for this stream.
  • **kwargs (Any) – Model-specific arguments forwarded from run_denoising_step.

Return type:

tuple[Tensor, ...]

scheduler_step()

static scheduler_step(latents, noise_pred, dt)

Return type:

Tensor
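Given the (latents, noise_pred, dt) signature, the step is plausibly a plain Euler update along the predicted velocity; a hedged NumPy sketch, not the MAX implementation:

```python
import numpy as np

def scheduler_step_sketch(latents, noise_pred, dt):
    """Hypothetical Euler update matching the (latents, noise_pred, dt)
    signature: move the latents along the prediction by dt."""
    return latents + dt * noise_pred
```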

text_encoder

text_encoder: Qwen3TextEncoderZImageModel

transformer

transformer: ZImageTransformerModel

unprefixed_weight_component

unprefixed_weight_component: str | None = 'transformer'

When set, weight files without a <component>/ prefix are assigned to this component. This supports multi-repo layouts where quantized weights for one component (e.g. the transformer) are shipped as flat files in a separate repo while the remaining components use the base model repo.
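The routing rule described above can be sketched as follows. Only the `<component>/` prefix convention comes from the docs; the function name and return shape are illustrative:

```python
def route_weight(name: str,
                 unprefixed_component: str = "transformer") -> tuple[str, str]:
    """Illustrative routing rule: a '<component>/' prefix selects the
    component; names without a prefix fall back to unprefixed_component."""
    if "/" in name:
        component, param = name.split("/", 1)
        return component, param
    return unprefixed_component, name
```

Under this rule, a flat quantized-transformer repo's weight names land on the transformer while prefixed names from the base repo still route to their own components.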

vae

vae: AutoencoderKLModel

ZImageTransformerModel

class max.pipelines.architectures.z_image_modulev3.ZImageTransformerModel(config, encoding, devices, weights, *, cache_config=None)

Bases: ComponentModel

Component wrapper for the compiled Z-Image transformer graph.

load_model()

load_model()

Load the runtime model instance.

Return type:

None

model

model: Callable[..., Any]
