For the complete documentation index, see llms.txt. Markdown versions of all pages are available by appending .md to any URL (e.g. /max/get-started.md).

Python module

max.pipelines.architectures.mistral3

Mistral 3 vision-language architecture for multimodal text generation.

`Mistral3Config`

class max.pipelines.architectures.mistral3.Mistral3Config(*, hidden_size, num_attention_heads, num_key_value_heads, num_hidden_layers, head_dim, vocab_size, rope_theta, max_seq_len, rms_norm_eps, feed_forward_length, dtype, kv_params, attention_multiplier, devices, return_logits=ReturnLogits.LAST_TOKEN)

source

Bases: MistralConfig

Configuration for Mistral3 models.

Parameters:

hidden_size (int)
num_attention_heads (int)
num_key_value_heads (int)
num_hidden_layers (int)
head_dim (int)
vocab_size (int)
rope_theta (float)
max_seq_len (int)
rms_norm_eps (float)
feed_forward_length (int)
dtype (DType)
kv_params (KVCacheParams)
attention_multiplier (float)
devices (list[DeviceRef])
return_logits (ReturnLogits)

`initialize()`

classmethod initialize(pipeline_config, model_config=None)

source

Initializes a MistralConfig instance from pipeline configuration.

This method creates a config instance with all fields that can be determined from the pipeline configuration.

Parameters:

pipeline_config (PipelineConfig) – The MAX Engine pipeline configuration.
model_config (MAXModelConfig | None)

Returns:

An initialized MistralConfig instance.

Return type:

Self

`Mistral3Model`

class max.pipelines.architectures.mistral3.Mistral3Model(pipeline_config, session, devices, kv_cache_config, weights, adapter=None, return_logits=ReturnLogits.LAST_TOKEN, max_batch_size=1)

source

Bases: MistralModel

Text-only Mistral3 pipeline model implementation.

Parameters:

pipeline_config (PipelineConfig)
session (InferenceSession)
devices (list[Device])
kv_cache_config (KVCacheConfig)
weights (Weights)
adapter (WeightsAdapter | None)
return_logits (ReturnLogits)
max_batch_size (int)

`calculate_max_seq_len()`

classmethod calculate_max_seq_len(pipeline_config, huggingface_config)

source

Bounds max_length by max_position_embeddings (config is permissive).

Parameters:

pipeline_config (PipelineConfig)
huggingface_config (AutoConfig)

Return type:

int

`get_kv_params()`

classmethod get_kv_params(huggingface_config, pipeline_config, devices, kv_cache_config, cache_dtype)

source

Returns the KV cache params for the pipeline model.

Delegates to model_config_cls.construct_kv_params(...). Subclasses with custom KV behavior should override this method.

Parameters:

huggingface_config (AutoConfig)
pipeline_config (PipelineConfig)
devices (list[DeviceRef])
kv_cache_config (KVCacheConfig)
cache_dtype (DType)

Return type:

KVCacheParamInterface

`model_config_cls`

model_config_cls

source

alias of Mistral3Config

Mistral3Config​

initialize()​

Mistral3Model​

calculate_max_seq_len()​

get_kv_params()​

model_config_cls​

`Mistral3Config`

`initialize()`

`Mistral3Model`

`calculate_max_seq_len()`

`get_kv_params()`

`model_config_cls`