IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /max/get-started.md). For the complete documentation index, see llms.txt.

Python module

max.pipelines.architectures.z_image_modulev3

Z-Image diffusion architecture for image generation.

ZImageArchConfig

class max.pipelines.architectures.z_image_modulev3.ZImageArchConfig(*, pipeline_config: 'PipelineConfig')

Bases: ArchConfig

Parameters:

pipeline_config (PipelineConfig)

get_max_seq_len()

get_max_seq_len()

Returns the default maximum sequence length for the model.

Subclasses should determine whether this value can be overridden by setting the --max-length (pipeline_config.model.max_length) flag.

Return type:

int
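
As a rough illustration of the override decision described above, the sketch below contrasts two policies. `FixedLengthArch`, `OverridableArch`, and the default of 4096 are hypothetical stand-ins, not part of the MAX API, and the real subclasses may apply different rules.

```python
from typing import Optional

DEFAULT_MAX_SEQ_LEN = 4096  # hypothetical architecture default


class FixedLengthArch:
    """Hypothetical arch config that ignores the user-supplied max_length."""

    def __init__(self, max_length: Optional[int] = None) -> None:
        self.max_length = max_length

    def get_max_seq_len(self) -> int:
        # Always report the architecture default.
        return DEFAULT_MAX_SEQ_LEN


class OverridableArch(FixedLengthArch):
    """Hypothetical arch config that honors max_length when it is set."""

    def get_max_seq_len(self) -> int:
        # Prefer the user-supplied value; otherwise fall back to the default.
        if self.max_length is not None:
            return self.max_length
        return DEFAULT_MAX_SEQ_LEN
```

In this sketch the first class treats the default as fixed, while the second lets a user-supplied length take precedence.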

initialize()

classmethod initialize(pipeline_config, model_config=None)

Initialize the config from a PipelineConfig.

Parameters:

  • pipeline_config (PipelineConfig) – The pipeline configuration.
  • model_config (MAXModelConfig | None) – The model configuration to read from. When None (the default), pipeline_config.model is used. Pass an explicit config (e.g. pipeline_config.draft_model) to initialize the arch config for a different model.

Return type:

Self
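
The fallback described for `model_config` can be sketched as follows. `select_model_config` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for a real `PipelineConfig`; this is a sketch of the documented default, not the library's implementation.

```python
from types import SimpleNamespace


def select_model_config(pipeline_config, model_config=None):
    """Hypothetical helper: pick the model config that initialize() would read."""
    # Documented behavior: None means "use pipeline_config.model".
    if model_config is None:
        return pipeline_config.model
    return model_config


# Stand-in for a PipelineConfig with a main model and a draft model.
pipeline_config = SimpleNamespace(
    model=SimpleNamespace(name="main"),
    draft_model=SimpleNamespace(name="draft"),
)

default_choice = select_model_config(pipeline_config)
draft_choice = select_model_config(pipeline_config, pipeline_config.draft_model)
```

Passing `pipeline_config.draft_model` explicitly selects the draft model's configuration, while omitting the argument selects the main model's.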

pipeline_config

pipeline_config: PipelineConfig

ZImageConfig

class max.pipelines.architectures.z_image_modulev3.ZImageConfig(*, config_file=None, section_name=None, all_patch_size=(2,), all_f_patch_size=(1,), in_channels=16, dim=3840, n_layers=30, n_refiner_layers=2, n_heads=30, n_kv_heads=30, norm_eps=1e-05, qk_norm=True, cap_feat_dim=2560, rope_theta=256.0, t_scale=1000.0, axes_dims=(32, 48, 48), axes_lens=(1024, 512, 512), dtype=bfloat16, device=<factory>)

Bases: MAXModelConfigBase
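
The defaults in the signature above can be sanity-checked with a little arithmetic. The sketch below mirrors a few of them in a plain dataclass; `ZImageDefaults` is a hypothetical stand-in, not the real `ZImageConfig`. It assumes the common transformer convention that the per-head dimension is `dim // n_heads`. The source does not state this, nor that `axes_dims` partitions the head dimension, although the default numbers are consistent with both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZImageDefaults:
    """Hypothetical stand-in holding a few ZImageConfig defaults."""

    dim: int = 3840
    n_heads: int = 30
    n_kv_heads: int = 30
    axes_dims: tuple[int, ...] = (32, 48, 48)
    axes_lens: tuple[int, ...] = (1024, 512, 512)

    @property
    def head_dim(self) -> int:
        # Assumption: attention heads split the model width evenly.
        return self.dim // self.n_heads


cfg = ZImageDefaults()
head_dim = cfg.head_dim          # 3840 // 30
axes_total = sum(cfg.axes_dims)  # 32 + 48 + 48
print(head_dim, axes_total)      # prints "128 128"
```

Under these assumptions the three `axes_dims` entries sum to exactly the per-head dimension, and `n_heads == n_kv_heads` means the defaults do not share key/value heads across query heads.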

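Parameters:

  • config_file
  • section_name
  • all_patch_size (tuple[int, ...])
  • all_f_patch_size (tuple[int, ...])
  • in_channels (int)
  • dim (int)
  • n_layers (int)
  • n_refiner_layers (int)
  • n_heads (int)
  • n_kv_heads (int)
  • norm_eps (float)
  • qk_norm (bool)
  • cap_feat_dim (int)
  • rope_theta (float)
  • t_scale (float)
  • axes_dims (tuple[int, ...])
  • axes_lens (tuple[int, ...])
  • dtype (DType)
  • device (DeviceRef)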

all_f_patch_size

all_f_patch_size: tuple[int, ...]

all_patch_size

all_patch_size: tuple[int, ...]

axes_dims

axes_dims: tuple[int, ...]

axes_lens

axes_lens: tuple[int, ...]

cap_feat_dim

cap_feat_dim: int

device

device: DeviceRef

dim

dim: int

dtype

dtype: DType

fbcache_dims()

fbcache_dims()

Returns the (hidden_dim, output_dim) pair per image token for FBCache and Taylor tensors.

Return type:

tuple[int, int]

in_channels

in_channels: int

initialize_from_config()

classmethod initialize_from_config(config_dict, encoding, devices)

Parameters:

  • config_dict (dict[str, Any])
  • encoding (Literal['float32', 'bfloat16', 'q4_k', 'q4_0', 'q6_k', 'float8_e4m3fn', 'float4_e2m1fnx2', 'gptq'])
  • devices (list[Device])

Return type:

Self
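
Because `encoding` is typed as a `Literal`, the accepted strings can be checked before calling `initialize_from_config()`. The sketch below is one way to do that with the standard `typing` module; the alias `Encoding` and the helper `check_encoding` are hypothetical names introduced for illustration, not part of the MAX API.

```python
from typing import Literal, get_args

# Hypothetical alias reproducing the documented Literal values.
Encoding = Literal[
    "float32",
    "bfloat16",
    "q4_k",
    "q4_0",
    "q6_k",
    "float8_e4m3fn",
    "float4_e2m1fnx2",
    "gptq",
]


def check_encoding(name: str) -> str:
    """Return name unchanged if it is a documented encoding, else raise."""
    allowed = get_args(Encoding)  # tuple of the Literal's string values
    if name not in allowed:
        raise ValueError(f"unsupported encoding {name!r}; expected one of {allowed}")
    return name
```

For example, `check_encoding("bfloat16")` returns the string unchanged, while `check_encoding("int8")` raises `ValueError`.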

model_config

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'strict': False}

Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

n_heads

n_heads: int

n_kv_heads

n_kv_heads: int

n_layers

n_layers: int

n_refiner_layers

n_refiner_layers: int

norm_eps

norm_eps: float

qk_norm

qk_norm: bool

rope_theta

rope_theta: float

t_scale

t_scale: float

ZImageTransformerModel

class max.pipelines.architectures.z_image_modulev3.ZImageTransformerModel(config, encoding, devices, weights, *, cache_config=None)

Bases: ComponentModel

Component wrapper for the compiled Z-Image transformer graph.

Parameters:

  • config
  • encoding
  • devices
  • weights
  • cache_config

load_model()

load_model()

Load and return a runtime model instance.

Return type:

None

model

model: Callable[[...], Any]