Python module

max.pipelines.architectures.olmo_modulev3

OLMo transformer architecture for text generation.

OlmoConfig

class max.pipelines.architectures.olmo_modulev3.OlmoConfig(*, hidden_size, num_attention_heads, num_key_value_heads, num_hidden_layers, rope_theta, rope_scaling_params, max_seq_len, intermediate_size, interleaved_rope_weights, vocab_size, dtype, kv_params, return_logits=ReturnLogits.LAST_TOKEN, norm_method='rms_norm', attention_bias=False, rms_norm_eps=None, tie_word_embeddings=False, stacked_mlp=False, stacked_qkv=False, attention_multiplier, embedding_multiplier, residual_multiplier, devices, clip_qkv=None, norm_elementwise_affine=False, longrope_scaling_params=None, logits_scaling=1.0, return_hidden_states=ReturnHiddenStates.NONE)

Bases: Llama3Config

Model configuration for Olmo graph construction/execution.

finalize()

finalize(huggingface_config, state_dict, return_logits, return_hidden_states=ReturnHiddenStates.NONE, norm_method='rms_norm', attention_bias=False)

Define parameters that can’t be determined just from the pipeline config.

Return type:

None

norm_elementwise_affine

norm_elementwise_affine: bool = False

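`norm_elementwise_affine` controls whether the normalization layers carry a learnable per-element scale; the `False` default corresponds to non-parametric normalization. A rough NumPy illustration of the difference (a sketch, not the MAX kernel):

```python
import numpy as np

def layer_norm(x, weight=None, eps=1e-5):
    # Normalize to zero mean and unit variance over the last axis.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    y = (x - mean) / np.sqrt(var + eps)
    # With elementwise affine, a learnable per-element scale is applied.
    return y if weight is None else y * weight

x = np.array([1.0, 2.0, 3.0, 4.0])
plain = layer_norm(x)                    # norm_elementwise_affine=False
scaled = layer_norm(x, np.full(4, 2.0))  # norm_elementwise_affine=True
```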
OlmoModel

class max.pipelines.architectures.olmo_modulev3.OlmoModel(pipeline_config, session, devices, kv_cache_config, weights, adapter=None, return_logits=ReturnLogits.LAST_TOKEN, return_hidden_states=ReturnHiddenStates.NONE)

Bases: Llama3Model

Olmo pipeline model implementation.

config_class

config_class

alias of OlmoConfig
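A `config_class` class attribute is a common hook by which shared pipeline-model machinery selects the architecture-specific config type; here `OlmoModel` points it at `OlmoConfig` while inheriting the rest of the `Llama3Model` behavior. A simplified sketch of the pattern, with hypothetical class names:

```python
class BaseConfig:
    def __init__(self, hidden_size: int) -> None:
        self.hidden_size = hidden_size

class OlmoLikeConfig(BaseConfig):
    pass

class BaseModel:
    # Subclasses override this with their architecture-specific config type.
    config_class = BaseConfig

    @classmethod
    def build_config(cls, **kwargs):
        return cls.config_class(**kwargs)

class OlmoLikeModel(BaseModel):
    config_class = OlmoLikeConfig  # analogous to `config_class` being an alias of OlmoConfig

cfg = OlmoLikeModel.build_config(hidden_size=4096)
```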

norm_method

norm_method: Literal['rms_norm'] | Literal['layer_norm'] = 'layer_norm'

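`norm_method` selects between RMS norm and layer norm for the model's normalization layers. The two differ in whether the mean is subtracted before rescaling; a hypothetical NumPy comparison (not the MAX implementation):

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # Scale by the root-mean-square only; no mean subtraction.
    return x / np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps)

def layer_norm(x, eps=1e-5):
    # Subtract the mean, then rescale to unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

x = np.array([0.5, -1.0, 2.0, 0.0])
r = rms_norm(x)    # norm_method='rms_norm'
l = layer_norm(x)  # norm_method='layer_norm'
```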