Supported models

The table below lists all the model architectures currently supported by MAX.

Each model architecture represents a family of models, following the architecture classes defined by Hugging Face Transformers. The example model names are Hugging Face repository IDs, such as google/gemma-3-27b-it for the Gemma3ForCausalLM architecture, but you can use any Hugging Face model that is based on one of the architectures below.

To deploy any of these models with MAX, pass the model name to the max serve or docker run command. Try it now by following the MAX quickstart guide, or, if you want to serve a custom model, see the tutorial on serving custom model architectures.
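For example, serving a model from the table might look like the following. This is a sketch based on the MAX quickstart; the exact flags can vary by MAX version, so verify against `max serve --help` before relying on it.

```shell
# Serve a supported model by passing its Hugging Face repo ID.
# google/gemma-3-27b-it is one of the example models listed below;
# any repo whose architecture appears in the table should work.
max serve --model-path=google/gemma-3-27b-it
```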

You can also view the model source code on GitHub.

| Architecture | Example models (repo IDs) | Modality | Encodings | Multi-GPU |
|---|---|---|---|---|
| BertModel | sentence-transformers/all-MiniLM-L6-v2, sentence-transformers/all-MiniLM-L12-v2 | text-to-embeddings | bfloat16, float32 | No |
| DeepseekV2ForCausalLM | deepseek-ai/DeepSeek-V2-Lite-Chat | text-to-text | bfloat16 | Yes |
| DeepseekV32ForCausalLM | deepseek-ai/DeepSeek-V3.2, deepseek-ai/DeepSeek-V3.2-Exp | text-to-text | float8_e4m3fn | Yes |
| DeepseekV3ForCausalLM | deepseek-ai/DeepSeek-V3 | text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| ExaoneForCausalLM | LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct, LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct, LGAI-EXAONE/EXAONE-3.5-32B-Instruct | text-to-text | bfloat16, float32, q6_k | No |
| Flux2KleinPipeline | black-forest-labs/FLUX.2-klein-4B, black-forest-labs/FLUX.2-klein-9B, black-forest-labs/FLUX.2-klein-base-4B, black-forest-labs/FLUX.2-klein-base-9B | text-to-image | bfloat16 | No |
| Flux2Pipeline | black-forest-labs/FLUX.2-dev | text-to-image | bfloat16 | No |
| FluxPipeline | black-forest-labs/FLUX.1-dev, black-forest-labs/FLUX.1-schnell | text-to-image | bfloat16 | No |
| Gemma3ForCausalLM | google/gemma-3-1b-it, google/gemma-3-1b-pt | text-to-text | bfloat16 | Yes |
| Gemma3ForConditionalGeneration | google/gemma-3-4b-it, google/gemma-3-4b-pt, google/gemma-3-12b-it, google/gemma-3-12b-pt, google/gemma-3-27b-it, google/gemma-3-27b-pt | image-to-text, text-to-text | bfloat16, float8_e4m3fn | Yes |
| GptOssForCausalLM | openai/gpt-oss-20b, openai/gpt-oss-120b, unsloth/gpt-oss-20b-BF16 | text-to-text | bfloat16, float4_e2m1fnx2 | Yes |
| GraniteForCausalLM | ibm-granite/granite-3.1-8b-instruct, ibm-granite/granite-3.1-8b-base | text-to-text | bfloat16, float32 | No |
| Idefics3ForConditionalGeneration | HuggingFaceM4/Idefics3-8B-Llama3 | image-to-text, text-to-text | bfloat16 | No |
| InternVLChatModel | OpenGVLab/InternVL3-8B-Instruct | image-to-text, text-to-text | bfloat16 | Yes |
| KimiK25ForConditionalGeneration | moonshotai/Kimi-K2.5, nvidia/Kimi-K2.5-NVFP4 | image-to-text, text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| KimiVLForConditionalGeneration | moonshotai/Kimi-VL-A3B-Instruct | image-to-text, text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| LlamaForCausalLM | meta-llama/Llama-3.1-8B-Instruct, deepseek-ai/DeepSeek-R1-Distill-Llama-8B, meta-llama/Llama-Guard-3-8B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B-Instruct, deepseek-ai/deepseek-coder-6.7b-instruct, modularai/Llama-3.1-8B-Instruct-GGUF | text-to-text | bfloat16, float32, float4_e2m1fnx2, float8_e4m3fn, gptq, q6_k | Yes |
| LlavaForConditionalGeneration | mistral-community/pixtral-12b | image-to-text, text-to-text | bfloat16 | No |
| Mistral3ForConditionalGeneration | mistralai/Mistral-Small-3.1-24B-Instruct-2503 | text-to-text | bfloat16 | Yes |
| MistralForCausalLM | mistralai/Mistral-Nemo-Instruct-2407 | text-to-text | bfloat16 | Yes |
| MPNetForMaskedLM | sentence-transformers/all-mpnet-base-v2 | text-to-embeddings | bfloat16, float32 | No |
| Olmo2ForCausalLM | allenai/OLMo-2-0425-1B-Instruct, allenai/OLMo-2-1124-7B, allenai/OLMo-2-1124-13B-Instruct, allenai/OLMo-2-0325-32B-Instruct, allenai/OLMo-2-1124-7B-GGUF | text-to-text | bfloat16, float32 | No |
| Olmo3ForCausalLM | allenai/Olmo-3-7B-Instruct | text-to-text | bfloat16 | No |
| OlmoForCausalLM | allenai/OLMo-1B-hf, allenai/OLMo-1B-0724-hf | text-to-text | bfloat16, float32 | No |
| Phi3ForCausalLM | microsoft/phi-4, microsoft/Phi-3.5-mini-instruct | text-to-text | bfloat16, float32 | No |
| Qwen2_5_VLForConditionalGeneration | Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| Qwen2ForCausalLM | Qwen/Qwen2.5-7B-Instruct, Qwen/QwQ-32B | text-to-text | bfloat16, float32 | Yes |
| Qwen3ForCausalLM | Qwen/Qwen3-8B, Qwen/Qwen3-30B-A3B, Qwen/Qwen3-Embedding-0.6B, Qwen/Qwen3-Embedding-4B, Qwen/Qwen3-Embedding-8B | text-to-embeddings, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| Qwen3MoeForCausalLM | Qwen/Qwen3-30B-A3B-Instruct, Qwen/Qwen3-30B-A3B-Instruct-2507-FP8 | text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| Qwen3VLForConditionalGeneration | Qwen/Qwen3-VL-4B-Instruct, Qwen/Qwen3-VL-2B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| Qwen3VLMoeForConditionalGeneration | Qwen/Qwen3-VL-30B-A3B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
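Because support is defined at the architecture level, you can check whether an arbitrary Hugging Face model is covered by reading the architectures field of its config.json and comparing it against the table above. The sketch below illustrates this; the is_supported helper is hypothetical (not part of MAX), and the supported set is transcribed from this page.

```python
import json

# Architecture names transcribed from the table above.
SUPPORTED_ARCHITECTURES = {
    "BertModel", "DeepseekV2ForCausalLM", "DeepseekV32ForCausalLM",
    "DeepseekV3ForCausalLM", "ExaoneForCausalLM", "Flux2KleinPipeline",
    "Flux2Pipeline", "FluxPipeline", "Gemma3ForCausalLM",
    "Gemma3ForConditionalGeneration", "GptOssForCausalLM",
    "GraniteForCausalLM", "Idefics3ForConditionalGeneration",
    "InternVLChatModel", "KimiK25ForConditionalGeneration",
    "KimiVLForConditionalGeneration", "LlamaForCausalLM",
    "LlavaForConditionalGeneration", "Mistral3ForConditionalGeneration",
    "MistralForCausalLM", "MPNetForMaskedLM", "Olmo2ForCausalLM",
    "Olmo3ForCausalLM", "OlmoForCausalLM", "Phi3ForCausalLM",
    "Qwen2_5_VLForConditionalGeneration", "Qwen2ForCausalLM",
    "Qwen3ForCausalLM", "Qwen3MoeForCausalLM",
    "Qwen3VLForConditionalGeneration", "Qwen3VLMoeForConditionalGeneration",
}

def is_supported(config_json: str) -> bool:
    """Return True if any architecture declared in a model's
    config.json appears in the supported set above."""
    config = json.loads(config_json)
    return any(a in SUPPORTED_ARCHITECTURES
               for a in config.get("architectures", []))

# meta-llama/Llama-3.1-8B-Instruct declares "LlamaForCausalLM"
# in its config.json, which is in the table above.
print(is_supported('{"architectures": ["LlamaForCausalLM"]}'))   # True
print(is_supported('{"architectures": ["SomeOtherModel"]}'))     # False
```

In practice you would fetch config.json from the model's Hugging Face repository rather than pass a literal string; the membership check is the same.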