IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /max/get-started.md). For the complete documentation index, see llms.txt.

Python class

PixelContext

class max.pipelines.context.PixelContext(*, tokens, request_id=<factory>, model_name='', mask=None, tokens_2=None, negative_tokens=None, negative_mask=None, negative_tokens_2=None, explicit_negative_prompt=False, timesteps=<factory>, sigmas=<factory>, latents=<factory>, latent_image_ids=<factory>, text_ids=<factory>, negative_text_ids=<factory>, height=1024, width=1024, num_inference_steps=50, guidance_scale=3.5, true_cfg_scale=1.0, strength=0.6, cfg_normalization=False, cfg_truncation=1.0, num_warmup_steps=0, num_images_per_prompt=1, input_image=None, input_images=None, prompt_images=None, vae_condition_images=None, output_format='jpeg', status=GenerationStatus.ACTIVE)

source

Bases: object

A model-ready context for image/video generation requests.

Per the design doc, this class contains only numeric data that the model will execute against. User-facing strings (prompt, negative_prompt) are consumed during tokenization and do not appear here.

All preprocessing is performed by PixelGenerationTokenizer.new_context():

  • Prompt tokenization -> tokens field
  • Negative prompt tokenization -> negative_tokens field
  • Timestep schedule computation -> timesteps field
  • Initial noise generation -> latents field
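The split between user-facing strings and model-ready numeric fields can be sketched with stand-in types. This is a minimal illustration, not MAX's implementation: `SketchRequest`, `SketchContext`, `toy_tokenize`, and `sketch_new_context` are hypothetical names, the tokenizer, schedule, and noise are placeholders, and only the field names `tokens`, `negative_tokens`, `timesteps`, and `latents` come from the class above.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchRequest:
    """Hypothetical user-facing request: the strings live here."""
    prompt: str
    negative_prompt: Optional[str] = None
    num_inference_steps: int = 50


@dataclass
class SketchContext:
    """Hypothetical model-ready context: numeric data only."""
    tokens: list
    negative_tokens: Optional[list]
    timesteps: list
    latents: list


def toy_tokenize(text: str) -> list:
    # Placeholder tokenizer: one integer per whitespace-separated word.
    return [sum(map(ord, word)) % 1000 for word in text.split()]


def sketch_new_context(req: SketchRequest, seed: int = 0) -> SketchContext:
    rng = random.Random(seed)
    steps = req.num_inference_steps
    negative = None
    if req.negative_prompt is not None:
        negative = toy_tokenize(req.negative_prompt)
    return SketchContext(
        tokens=toy_tokenize(req.prompt),
        negative_tokens=negative,
        # Placeholder schedule: evenly spaced values starting at 1.0 and decreasing.
        timesteps=[1.0 - i / steps for i in range(steps)],
        # Placeholder noise: a flat list of Gaussian samples.
        latents=[rng.gauss(0.0, 1.0) for _ in range(16)],
    )
```

The resulting context carries no `prompt` attribute; by the time it exists, the strings have already been turned into token IDs.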

Parameters:

cfg_normalization​

cfg_normalization: bool = False

source

cfg_truncation​

cfg_truncation: float = 1.0

source

compute_num_available_steps()​

compute_num_available_steps(max_seq_len)

source

Compute number of available steps for scheduler compatibility.

For image and video generation, this returns the number of inference steps.

Parameters:

max_seq_len (int)

Return type:

int
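As a rough illustration of the behavior described above, the stand-in below returns the configured step count. `StepsSketch` is a hypothetical class, and the idea that `max_seq_len` has no effect on the result is an assumption drawn from the one-line description, not a confirmed detail of the implementation.

```python
from dataclasses import dataclass


@dataclass
class StepsSketch:
    """Hypothetical stand-in for the step-related part of PixelContext."""
    num_inference_steps: int = 50

    def compute_num_available_steps(self, max_seq_len: int) -> int:
        # Assumption: max_seq_len is accepted so the scheduler can call this
        # method uniformly, but it does not change the result.
        return self.num_inference_steps
```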

explicit_negative_prompt​

explicit_negative_prompt: bool = False

source

Whether the request explicitly supplied a negative prompt.

guidance_scale​

guidance_scale: float = 3.5

source

height​

height: int = 1024

source

input_image​

input_image: ndarray[tuple[Any, ...], dtype[uint8]] | None = None

source

Input image as a NumPy array of shape (H, W, C) and dtype uint8, used for image-to-image generation.

input_images​

input_images: list[ndarray[tuple[Any, ...], dtype[uint8]]] | None = None

source

Input images as a list of NumPy arrays, each of shape (H, W, C) and dtype uint8, used for image-to-image generation.

is_done​

property is_done: bool

source

Whether the request has completed generation.
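One plausible reading is that `is_done` is derived from the `status` field, whose default is `GenerationStatus.ACTIVE`. The sketch below uses hypothetical stand-ins (`StatusSketch`, `DoneSketch`); the `FINISHED` member and the rule "anything other than ACTIVE is done" are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class StatusSketch(str, Enum):
    """Hypothetical stand-in for GenerationStatus."""
    ACTIVE = "active"      # matches the default shown on this page
    FINISHED = "finished"  # hypothetical terminal state


@dataclass
class DoneSketch:
    status: StatusSketch = StatusSketch.ACTIVE

    @property
    def is_done(self) -> bool:
        # Assumption: any status other than ACTIVE counts as done.
        return self.status is not StatusSketch.ACTIVE
```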

latent_image_ids​

latent_image_ids: ndarray[tuple[Any, ...], dtype[float32]]

source

Precomputed latent image IDs for generation.

latents​

latents: ndarray[tuple[Any, ...], dtype[float32]]

source

Precomputed initial noise (latents) for generation.

mask​

mask: ndarray[tuple[Any, ...], dtype[bool]] | None = None

source

Attention mask for the text encoder.

model_name​

model_name: str = ''

source

negative_mask​

negative_mask: ndarray[tuple[Any, ...], dtype[bool]] | None = None

source

Mask for the negative text encoder path.

negative_text_ids​

negative_text_ids: ndarray[tuple[Any, ...], dtype[int64]]

source

Precomputed text position IDs for the negative prompt.

negative_tokens​

negative_tokens: TokenBuffer | None = None

source

Negative tokens for primary encoder.

negative_tokens_2​

negative_tokens_2: TokenBuffer | None = None

source

Negative tokens for secondary encoder. None for single-encoder models.

num_images_per_prompt​

num_images_per_prompt: int = 1

source

num_inference_steps​

num_inference_steps: int = 50

source

num_warmup_steps​

num_warmup_steps: int = 0

source

output_format​

output_format: str = 'jpeg'

source

Image encoding format for the output (e.g., ‘jpeg’, ‘png’, ‘webp’).
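A caller serving the encoded image over HTTP might map this field to a content type. The helper below is a hypothetical sketch and not part of MAX; it covers only the three formats named above, and the real set of accepted values may be larger.

```python
# Formats named on this page; the real set of supported values may be larger.
MIME_BY_FORMAT = {
    "jpeg": "image/jpeg",
    "png": "image/png",
    "webp": "image/webp",
}


def content_type_for(output_format: str) -> str:
    """Return the MIME type for a format name (hypothetical helper)."""
    key = output_format.strip().lower()
    if key not in MIME_BY_FORMAT:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    return MIME_BY_FORMAT[key]
```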

prompt_images​

prompt_images: list[ndarray[tuple[Any, ...], dtype[uint8]]] | None = None

source

Optional prompt-conditioning images prepared by the tokenizer.

request_id​

request_id: RequestID

source

reset()​

reset()

source

Resets the context’s state.

Return type:

None

sigmas​

sigmas: ndarray[tuple[Any, ...], dtype[float32]]

source

Precomputed sigmas schedule for denoising.

status​

status: GenerationStatus = 'active'

source

strength​

strength: float = 0.6

source
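This page does not describe `strength`. In many image-to-image diffusion pipelines, a parameter with this name sets the fraction of the denoising schedule that is actually run on the noised input image. The sketch below follows that common convention; whether PixelContext interprets `strength` the same way is an assumption, and `steps_to_run` is a hypothetical helper.

```python
def steps_to_run(num_inference_steps: int, strength: float) -> tuple:
    """Return (start_index, steps_run) under a common img2img convention."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # Run only the last `strength` fraction of the schedule.
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    # Skip the earliest (noisiest) steps; the input image stands in for them.
    start_index = max(num_inference_steps - steps_run, 0)
    return start_index, steps_run
```

Under this convention, `strength=1.0` runs the whole schedule and effectively ignores the input image, while `strength=0.0` runs no steps and leaves the image unchanged.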

text_ids​

text_ids: ndarray[tuple[Any, ...], dtype[int64]]

source

Precomputed text position IDs with shape (B, seq_len, 4) and dtype int64.

timesteps​

timesteps: ndarray[tuple[Any, ...], dtype[float32]]

source

Precomputed timesteps schedule for denoising.

to_generation_output()​

to_generation_output()

source

Convert this context to a GenerationOutput object.

Return type:

GenerationOutput

tokens​

tokens: TokenBuffer

source

Primary encoder tokens.

tokens_2​

tokens_2: TokenBuffer | None = None

source

Secondary encoder tokens. None for single-encoder models.
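Because `tokens_2` is None for single-encoder models, downstream code can branch on it to decide how many text encoders to run. A minimal sketch, assuming plain lists in place of `TokenBuffer`; `TokensSketch` and `encoder_inputs` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokensSketch:
    """Hypothetical stand-in; plain lists replace TokenBuffer."""
    tokens: list
    tokens_2: Optional[list] = None


def encoder_inputs(ctx: TokensSketch) -> list:
    # One entry per text encoder that should run for this request.
    inputs = [ctx.tokens]
    if ctx.tokens_2 is not None:
        inputs.append(ctx.tokens_2)
    return inputs
```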

true_cfg_scale​

true_cfg_scale: float = 1.0

source

update()​

update(latents)

source

Update the context with newly generated latents/image data.

Parameters:

latents (ndarray[tuple[Any, ...], dtype[Any]])

Return type:

None

vae_condition_images​

vae_condition_images: list[ndarray[tuple[Any, ...], dtype[uint8]]] | None = None

source

Optional VAE-conditioning images prepared by the tokenizer.

Qwen image edit keeps prompt-conditioning images and VAE-conditioning images separate because the multimodal prompt encoder and the VAE latent conditioning path use different resize targets.
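The effect of two different resize targets can be sketched as follows. The target areas, the alignment multiple, and the helper names `fit_to_area` and `condition_sizes` are illustrative assumptions; this page does not state the values MAX uses.

```python
import math

# Illustrative target areas only; the real values are not given on this page.
PROMPT_AREA = 384 * 384
VAE_AREA = 1024 * 1024


def fit_to_area(width: int, height: int, target_area: int, multiple: int = 32) -> tuple:
    """Scale (width, height) to roughly target_area pixels, keeping aspect ratio."""
    ratio = width / height
    new_w = math.sqrt(target_area * ratio)
    new_h = new_w / ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple (an assumed alignment requirement).
        return max(multiple, round(value / multiple) * multiple)

    return snap(new_w), snap(new_h)


def condition_sizes(width: int, height: int) -> dict:
    # The same source image yields two differently sized copies.
    return {
        "prompt_images": fit_to_area(width, height, PROMPT_AREA),
        "vae_condition_images": fit_to_area(width, height, VAE_AREA),
    }
```

Since the two paths need different sizes of the same source image, storing both lists on the context avoids resizing again at model time.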

width​

width: int = 1024

source