Python module

max.pipelines.architectures.qwen_image

Qwen-Image diffusion architecture for image generation.

QwenImageArchConfig

class max.pipelines.architectures.qwen_image.QwenImageArchConfig(*, pipeline_config)

Bases: ArchConfig

Pipeline-level config for QwenImage (implements ArchConfig; no KV cache).

Parameters:

pipeline_config (PipelineConfig)

get_max_seq_len()

get_max_seq_len()

Returns the default maximum sequence length for the model.

Subclasses should determine whether this value can be overridden by setting the --max-length (pipeline_config.model.max_length) flag.

Return type:

int
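
The override rule described above can be illustrated with a small stand-alone helper. This is only a sketch: `resolve_max_seq_len` and its `allow_override` argument are hypothetical names, and this page does not state how QwenImageArchConfig itself decides whether `--max-length` is honored.

```python
from typing import Optional


def resolve_max_seq_len(
    default_len: int,
    max_length: Optional[int],
    allow_override: bool,
) -> int:
    """Pick the effective maximum sequence length (hypothetical helper)."""
    # Use the user-supplied limit only when the architecture opts in
    # and a value was actually provided.
    if allow_override and max_length is not None:
        return max_length
    # Otherwise keep the architecture's default.
    return default_len
```

A subclass that ignores the flag would correspond to calling this helper with `allow_override=False`.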

initialize()

classmethod initialize(pipeline_config, model_config=None)

Initialize the config from a PipelineConfig.

Parameters:

  • pipeline_config (PipelineConfig) – The pipeline configuration.
  • model_config (MAXModelConfig | None) – The model configuration to read from. When None (the default), pipeline_config.model is used. Pass an explicit config (e.g. pipeline_config.draft_model) to initialize the arch config for a different model.

Return type:

Self
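
The `model_config` fallback described above can be sketched with stand-in classes. The snippet does not import MAX: `FakeModelConfig`, `FakePipelineConfig`, and `resolve_model_config` are hypothetical names used only for illustration, and the real `initialize()` may do considerably more than select a config.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeModelConfig:
    """Hypothetical stand-in for MAXModelConfig."""

    name: str


@dataclass
class FakePipelineConfig:
    """Hypothetical stand-in for PipelineConfig."""

    model: FakeModelConfig
    draft_model: Optional[FakeModelConfig] = None


def resolve_model_config(
    pipeline_config: FakePipelineConfig,
    model_config: Optional[FakeModelConfig] = None,
) -> FakeModelConfig:
    # Mirror the documented default: fall back to pipeline_config.model
    # when no explicit model config is passed.
    if model_config is None:
        return pipeline_config.model
    return model_config


pipeline = FakePipelineConfig(
    model=FakeModelConfig("main"),
    draft_model=FakeModelConfig("draft"),
)
default_choice = resolve_model_config(pipeline)
explicit_choice = resolve_model_config(pipeline, pipeline.draft_model)
```

Here `default_choice` is the main model's config, while `explicit_choice` is the draft model's config because it was passed explicitly.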

pipeline_config

pipeline_config: PipelineConfig

QwenImageConfig

class max.pipelines.architectures.qwen_image.QwenImageConfig(*, config_file=None, section_name=None, patch_size=2, in_channels=64, out_channels=None, num_layers=60, attention_head_dim=128, num_attention_heads=24, joint_attention_dim=3584, guidance_embeds=False, axes_dims_rope=(16, 56, 56), rope_theta=10000, zero_cond_t=False, eps=1e-06, dtype=bfloat16, device=<factory>)

Bases: QwenImageConfigBase

Parameters:

  • config_file (str | None)
  • section_name (str | None)
  • patch_size (int)
  • in_channels (int)
  • out_channels (int | None)
  • num_layers (int)
  • attention_head_dim (int)
  • num_attention_heads (int)
  • joint_attention_dim (int)
  • guidance_embeds (bool)
  • axes_dims_rope (tuple[int, ...])
  • rope_theta (int)
  • zero_cond_t (bool)
  • eps (float)
  • dtype (DType)
  • device (DeviceRef)
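
To make the default values easier to read, the sketch below copies a few of them into a plain dataclass and derives two quantities from them. `TransformerDefaults` is a hypothetical stand-in, not the MAX class. Both derived relationships are assumptions based on common transformer conventions rather than statements from this page: that the hidden width is the head count times the per-head dimension, and that the per-axis rotary dimensions add up to one attention head.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TransformerDefaults:
    """Hypothetical mirror of a few documented QwenImageConfig defaults."""

    patch_size: int = 2
    in_channels: int = 64
    num_layers: int = 60
    attention_head_dim: int = 128
    num_attention_heads: int = 24
    joint_attention_dim: int = 3584
    axes_dims_rope: Tuple[int, ...] = (16, 56, 56)


cfg = TransformerDefaults()

# Assumption: hidden width = number of heads * per-head dimension.
inner_dim = cfg.num_attention_heads * cfg.attention_head_dim

# Assumption: the per-axis rotary dimensions partition one attention head.
rope_dim = sum(cfg.axes_dims_rope)

print(inner_dim, rope_dim)  # 3072 128
```

With the documented defaults, the rotary dimensions sum to 128, which matches `attention_head_dim`.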

generate()

static generate(config_dict, encoding, devices)

Parameters:

  • config_dict (dict[str, Any])
  • encoding (Literal['float32', 'bfloat16', 'q4_k', 'q4_0', 'q6_k', 'float8_e4m3fn', 'float4_e2m1fnx2', 'gptq'])
  • devices (list[Device])

Return type:

QwenImageConfigBase
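
Because `encoding` is typed as a `Literal`, a caller can validate an encoding name before passing it to `generate()`. The sketch below uses only the standard library. The `Encoding` alias and `check_encoding` helper are hypothetical, and the list of names is copied from the signature above. Whether the Qwen-Image architecture supports every listed encoding at runtime is not stated on this page.

```python
from typing import Literal, get_args

# Encoding names copied from the documented signature of generate().
Encoding = Literal[
    "float32",
    "bfloat16",
    "q4_k",
    "q4_0",
    "q6_k",
    "float8_e4m3fn",
    "float4_e2m1fnx2",
    "gptq",
]


def check_encoding(name: str) -> str:
    """Return name unchanged if it is a documented encoding, else raise."""
    allowed = get_args(Encoding)
    if name not in allowed:
        raise ValueError(
            f"unsupported encoding {name!r}; expected one of {allowed}"
        )
    return name
```

For example, `check_encoding("bfloat16")` returns the string unchanged, while `check_encoding("int8")` raises `ValueError`.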

model_config

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'strict': False}

Pydantic configuration for this class; it should be a dictionary conforming to pydantic.config.ConfigDict.

QwenImageConfigBase

class max.pipelines.architectures.qwen_image.QwenImageConfigBase(*, config_file=None, section_name=None, patch_size=2, in_channels=64, out_channels=None, num_layers=60, attention_head_dim=128, num_attention_heads=24, joint_attention_dim=3584, guidance_embeds=False, axes_dims_rope=(16, 56, 56), rope_theta=10000, zero_cond_t=False, eps=1e-06, dtype=bfloat16, device=<factory>)

Bases: MAXModelConfigBase

Parameters:

  • config_file (str | None)
  • section_name (str | None)
  • patch_size (int)
  • in_channels (int)
  • out_channels (int | None)
  • num_layers (int)
  • attention_head_dim (int)
  • num_attention_heads (int)
  • joint_attention_dim (int)
  • guidance_embeds (bool)
  • axes_dims_rope (tuple[int, ...])
  • rope_theta (int)
  • zero_cond_t (bool)
  • eps (float)
  • dtype (DType)
  • device (DeviceRef)

attention_head_dim

attention_head_dim: int

axes_dims_rope

axes_dims_rope: tuple[int, ...]

device

device: DeviceRef

dtype

dtype: DType

eps

eps: float

guidance_embeds

guidance_embeds: bool

in_channels

in_channels: int

joint_attention_dim

joint_attention_dim: int

model_config

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'strict': False}

Pydantic configuration for this class; it should be a dictionary conforming to pydantic.config.ConfigDict.

num_attention_heads

num_attention_heads: int

num_layers

num_layers: int

out_channels

out_channels: int | None

patch_size

patch_size: int

rope_theta

rope_theta: int

zero_cond_t

zero_cond_t: bool

QwenImageTransformerModel

class max.pipelines.architectures.qwen_image.QwenImageTransformerModel(config, encoding, devices, weights, session)

Bases: ComponentModel

Parameters:

  • config
  • encoding
  • devices
  • weights
  • session

load_model()

load_model()

Load and return a runtime model instance.

Return type:

Callable[..., Any]