Python module
max.pipelines.architectures.z_image_modulev3
Z-Image diffusion architecture for image generation.
ZImageArchConfig
class max.pipelines.architectures.z_image_modulev3.ZImageArchConfig(*, pipeline_config: 'PipelineConfig')
Bases: ArchConfig
Parameters:
- pipeline_config (PipelineConfig)
get_max_seq_len()
get_max_seq_len()
Returns the default maximum sequence length for the model.
Subclasses should determine whether this value can be overridden by
setting the --max-length (pipeline_config.model.max_length) flag.
Return type:
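The override behavior described above can be sketched as follows. This is a minimal illustration using hypothetical stand-in classes (`StubModelConfig`, `StubPipelineConfig`, `StubArchConfig`) and an invented default of 4096; none of these come from the MAX library, and the real subclass may decide differently whether to honor `max_length`.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins for illustration only; these are NOT the MAX classes.
@dataclass
class StubModelConfig:
    # Mirrors the role of pipeline_config.model.max_length (the --max-length flag).
    max_length: Optional[int] = None


@dataclass
class StubPipelineConfig:
    model: StubModelConfig


class StubArchConfig:
    # Illustrative default; this value is an assumption, not taken from the docs.
    DEFAULT_MAX_SEQ_LEN = 4096

    def __init__(self, *, pipeline_config: StubPipelineConfig) -> None:
        self.pipeline_config = pipeline_config

    def get_max_seq_len(self) -> int:
        # This particular stand-in chooses to honor an explicit override.
        override = self.pipeline_config.model.max_length
        return override if override is not None else self.DEFAULT_MAX_SEQ_LEN


default_cfg = StubArchConfig(
    pipeline_config=StubPipelineConfig(model=StubModelConfig())
)
capped_cfg = StubArchConfig(
    pipeline_config=StubPipelineConfig(model=StubModelConfig(max_length=512))
)
```

Here `default_cfg` falls back to the class default, while `capped_cfg` uses the user-supplied length.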
initialize()
classmethod initialize(pipeline_config, model_config=None)
Initialize the config from a PipelineConfig.
Parameters:
- pipeline_config (PipelineConfig): The pipeline configuration.
- model_config (MAXModelConfig | None): The model configuration to read from. When None (the default), pipeline_config.model is used. Pass an explicit config (e.g. pipeline_config.draft_model) to initialize the arch config for a different model.
Return type:
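The `model_config` fallback described for `initialize()` follows a common pattern, sketched below. `StubArchConfig` and the `source_model` attribute are hypothetical names introduced for illustration, and `SimpleNamespace` objects stand in for the real `PipelineConfig`; the actual implementation may differ.

```python
from types import SimpleNamespace


class StubArchConfig:
    """Hypothetical stand-in; NOT the real ZImageArchConfig."""

    def __init__(self, *, pipeline_config, source_model):
        self.pipeline_config = pipeline_config
        # Records which model config the arch config was built from.
        self.source_model = source_model

    @classmethod
    def initialize(cls, pipeline_config, model_config=None):
        # Fall back to the main model config when none is passed explicitly.
        if model_config is None:
            model_config = pipeline_config.model
        return cls(pipeline_config=pipeline_config, source_model=model_config)


# A fake pipeline config with a main model and a draft model.
pipeline = SimpleNamespace(
    model=SimpleNamespace(name="main"),
    draft_model=SimpleNamespace(name="draft"),
)

main_arch = StubArchConfig.initialize(pipeline)
draft_arch = StubArchConfig.initialize(pipeline, model_config=pipeline.draft_model)
```

With no explicit argument, `main_arch` reads from `pipeline.model`; passing `pipeline.draft_model` makes `draft_arch` read from the draft model instead.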
pipeline_config
pipeline_config: PipelineConfig
ZImageConfig
class max.pipelines.architectures.z_image_modulev3.ZImageConfig(*, config_file=None, section_name=None, all_patch_size=(2, ), all_f_patch_size=(1, ), in_channels=16, dim=3840, n_layers=30, n_refiner_layers=2, n_heads=30, n_kv_heads=30, norm_eps=1e-05, qk_norm=True, cap_feat_dim=2560, rope_theta=256.0, t_scale=1000.0, axes_dims=(32, 48, 48), axes_lens=(1024, 512, 512), dtype=bfloat16, device=<factory>)
Bases: MAXModelConfigBase
Parameters:
- config_file (str | None)
- section_name (str | None)
- all_patch_size (tuple[int, ...])
- all_f_patch_size (tuple[int, ...])
- in_channels (int)
- dim (int)
- n_layers (int)
- n_refiner_layers (int)
- n_heads (int)
- n_kv_heads (int)
- norm_eps (float)
- qk_norm (bool)
- cap_feat_dim (int)
- rope_theta (float)
- t_scale (float)
- axes_dims (tuple[int, ...])
- axes_lens (tuple[int, ...])
- dtype (DType)
- device (DeviceRef)
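As a quick sanity check on how the default values in the signature relate, the sketch below copies a few of them into a plain dataclass and derives the per-head dimension. `ZImageDefaults` is a hypothetical stand-in, not the real `ZImageConfig`, and reading `axes_dims` as per-axis rotary dimensions that together span one attention head is an assumption based on the parameter names.

```python
from dataclasses import dataclass
from typing import Tuple


# Hypothetical stand-in that copies defaults from the documented signature.
@dataclass(frozen=True)
class ZImageDefaults:
    dim: int = 3840
    n_heads: int = 30
    n_kv_heads: int = 30
    axes_dims: Tuple[int, ...] = (32, 48, 48)
    axes_lens: Tuple[int, ...] = (1024, 512, 512)


cfg = ZImageDefaults()

# Per-head width: the model dimension split evenly across attention heads.
head_dim = cfg.dim // cfg.n_heads

# Assumed interpretation: the per-axis rotary dims add up to one head's width.
rope_dim = sum(cfg.axes_dims)

# Equal query and key/value head counts mean no head grouping by default.
uses_grouped_kv = cfg.n_kv_heads != cfg.n_heads
```

With the documented defaults, `head_dim` and `rope_dim` both come out to 128, and `axes_dims` and `axes_lens` each have three entries, one per axis.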
all_f_patch_size
all_patch_size
axes_dims
axes_lens
cap_feat_dim
cap_feat_dim: int
device
device: DeviceRef
dim
dim: int
dtype
dtype: DType
fbcache_dims()
fbcache_dims()
Returns (hidden_dim, output_dim) per image token for FBCache / Taylor tensors.
in_channels
in_channels: int
initialize_from_config()
classmethod initialize_from_config(config_dict, encoding, devices)
model_config
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'strict': False}
Configuration for the model. It should be a dictionary conforming to pydantic.config.ConfigDict.
n_heads
n_heads: int
n_kv_heads
n_kv_heads: int
n_layers
n_layers: int
n_refiner_layers
n_refiner_layers: int
norm_eps
norm_eps: float
qk_norm
qk_norm: bool
rope_theta
rope_theta: float
t_scale
t_scale: float
ZImageTransformerModel
class max.pipelines.architectures.z_image_modulev3.ZImageTransformerModel(config, encoding, devices, weights, *, cache_config=None)
Bases: ComponentModel
Component wrapper for the compiled Z-Image transformer graph.
Parameters:
load_model()
load_model()
Load and return a runtime model instance.
Return type:
None
model