> For the complete documentation index, see [llms.txt](https://docs.modular.com/llms.txt).
> Markdown versions of all pages are available by appending .md to any URL (e.g. /max/get-started.md).

# Supported models

The table below lists all the model architectures currently supported by MAX. Each model architecture represents a family of different models, as defined by Hugging Face Transformers. The example model names are Hugging Face repository IDs, such as `google/gemma-3-27b-it` for the `Gemma3ForCausalLM` architecture, but you can use any model from Hugging Face that's based on an architecture below.

To deploy any of these models with MAX, pass the model name to the [`max serve`](https://docs.modular.com/max/cli/serve.md) or [`docker run`](https://docs.modular.com/max/container.md) command. Try it now by following the [MAX quickstart guide](https://docs.modular.com/max/get-started.md). Or if you want to serve a custom model, see the tutorial to [serve custom model architectures](https://docs.modular.com/max/develop/serve-custom-model-architectures.md).

You can also see the [model source code in GitHub](https://github.com/modular/modular/tree/main/max/python/max/pipelines/architectures).
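
As a rough illustration of passing a model name to `max serve`, the Python sketch below assembles the command line for a given Hugging Face repository ID. The helper name `build_serve_command` is hypothetical, and the `--model` flag is an assumption about the CLI that this page does not confirm; check `max serve --help` for the flag your installed version expects.

```python
import shlex


def build_serve_command(repo_id: str) -> list[str]:
    """Return an argv list that serves a Hugging Face repo ID with MAX.

    Assumption: the CLI takes the repo ID through a `--model` flag.
    Confirm the exact flag with `max serve --help` before relying on it.
    """
    # Repo IDs in the table have the form "organization/model-name".
    if "/" not in repo_id:
        raise ValueError(f"expected an 'org/name' repo ID, got {repo_id!r}")
    return ["max", "serve", "--model", repo_id]


cmd = build_serve_command("google/gemma-3-27b-it")
# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

To actually start the server from Python, the same list could be handed to `subprocess.run(cmd, check=True)`.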

| Architecture | Example models (repo IDs) | Modality | Encodings | Multi-GPU |
| --- | --- | --- | --- | --- |
| `BertModel` | sentence-transformers/all-MiniLM-L6-v2, sentence-transformers/all-MiniLM-L12-v2 | text-to-embeddings | bfloat16, float32 | No |
| `DeepseekV2ForCausalLM` | deepseek-ai/DeepSeek-V2-Lite-Chat | text-to-text | bfloat16 | Yes |
| `DeepseekV32ForCausalLM` | deepseek-ai/DeepSeek-V3.2, deepseek-ai/DeepSeek-V3.2-Exp | text-to-text | float8_e4m3fn | Yes |
| `DeepseekV3ForCausalLM` | deepseek-ai/DeepSeek-V3 | text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| `DFlashDraftModel` | z-lab/LLaMA3.1-8B-Instruct-DFlash-UltraChat | text-to-text | bfloat16, float32 | Yes |
| `ExaoneForCausalLM` | LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct, LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct, LGAI-EXAONE/EXAONE-3.5-32B-Instruct | text-to-text | bfloat16, float32, q6_k | No |
| `Flux2KleinPipeline` | black-forest-labs/FLUX.2-klein-4B, black-forest-labs/FLUX.2-klein-9B, black-forest-labs/FLUX.2-klein-base-4B, black-forest-labs/FLUX.2-klein-base-9B, black-forest-labs/FLUX.2-klein-4b-nvfp4, black-forest-labs/FLUX.2-klein-9b-nvfp4 | image-to-image, text-to-image | bfloat16, float4_e2m1fnx2 | No |
| `Flux2Pipeline` | black-forest-labs/FLUX.2-dev, black-forest-labs/FLUX.2-dev-NVFP4 | image-to-image, text-to-image | bfloat16, float4_e2m1fnx2 | No |
| `Gemma3ForCausalLM` | google/gemma-3-1b-it, google/gemma-3-1b-pt | text-to-text | bfloat16 | Yes |
| `Gemma3ForConditionalGeneration` | google/gemma-3-4b-it, google/gemma-3-4b-pt, google/gemma-3-12b-it, google/gemma-3-12b-pt, google/gemma-3-27b-it, google/gemma-3-27b-pt | image-to-text, text-to-text | bfloat16, float8_e4m3fn | Yes |
| `Gemma4ForConditionalGeneration` | google/gemma-4-31B-it, nvidia/Gemma-4-31B-IT-NVFP4 | image-to-text, text-to-text, video-to-text | bfloat16, float4_e2m1fnx2 | Yes |
| `GlmMoeDsaForCausalLM` | zai-org/GLM-5.1, zai-org/GLM-5.1-FP8, zai-org/GLM-5 | text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| `GptOssForCausalLM` | openai/gpt-oss-20b, openai/gpt-oss-120b, unsloth/gpt-oss-20b-BF16 | text-to-text | bfloat16, float4_e2m1fnx2 | Yes |
| `GraniteForCausalLM` | ibm-granite/granite-3.1-8b-instruct, ibm-granite/granite-3.1-8b-base | text-to-text | bfloat16, float32 | No |
| `HYV3ForCausalLM` | tencent/Hy3-preview | text-to-text | bfloat16 | Yes |
| `Idefics3ForConditionalGeneration` | HuggingFaceM4/Idefics3-8B-Llama3 | image-to-text, text-to-text | bfloat16 | No |
| `InternVLChatModel` | OpenGVLab/InternVL3-8B-Instruct | image-to-text, text-to-text | bfloat16 | Yes |
| `KimiK25ForConditionalGeneration` | nvidia/Kimi-K2.5-NVFP4, nvidia/Kimi-K2.6-NVFP4 | image-to-text, text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| `KimiVLForConditionalGeneration` | moonshotai/Kimi-VL-A3B-Instruct | image-to-text, text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| `Lfm2ForCausalLM` | LiquidAI/LFM2.5-350M, LiquidAI/LFM2.5-350M-Base | text-to-text | bfloat16, float32 | No |
| `LlamaForCausalLM` | meta-llama/Llama-3.1-8B-Instruct, deepseek-ai/DeepSeek-R1-Distill-Llama-8B, meta-llama/Llama-Guard-3-8B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B-Instruct, deepseek-ai/deepseek-coder-6.7b-instruct, modularai/Llama-3.1-8B-Instruct-GGUF | text-to-text | bfloat16, float32, float4_e2m1fnx2, float8_e4m3fn, gptq, q6_k | Yes |
| `LlavaForConditionalGeneration` | mistral-experimental/pixtral-12b | image-to-text, text-to-text | bfloat16 | No |
| `MambaForCausalLM` | state-spaces/mamba-130m-hf | text-to-text | bfloat16, float32 | No |
| `MiniMaxM2ForCausalLM` | MiniMaxAI/MiniMax-M2.7, MiniMaxAI/MiniMax-M2.5, lukealonso/MiniMax-M2.7-NVFP4, amd/MiniMax-M2.7-MXFP4 | text-to-text | float4_e2m1fnx2, float8_e4m3fn | Yes |
| `Mistral3ForConditionalGeneration` | mistralai/Mistral-Small-3.1-24B-Instruct-2503 | text-to-text | bfloat16 | Yes |
| `MistralForCausalLM` | mistralai/Mistral-Nemo-Instruct-2407 | text-to-text | bfloat16 | Yes |
| `MPNetForMaskedLM` | sentence-transformers/all-mpnet-base-v2 | text-to-embeddings | bfloat16, float32 | No |
| `Olmo2ForCausalLM` | allenai/OLMo-2-0425-1B-Instruct, allenai/OLMo-2-1124-7B, allenai/OLMo-2-1124-13B-Instruct, allenai/OLMo-2-0325-32B-Instruct, allenai/OLMo-2-1124-7B-GGUF | text-to-text | bfloat16, float32 | No |
| `Olmo3ForCausalLM` | allenai/Olmo-3-7B-Instruct | text-to-text | bfloat16 | No |
| `OlmoForCausalLM` | allenai/OLMo-1B-hf, allenai/OLMo-1B-0724-hf | text-to-text | bfloat16, float32 | No |
| `Phi3ForCausalLM` | microsoft/phi-4, microsoft/Phi-3.5-mini-instruct | text-to-text | bfloat16, float32 | No |
| `Qwen2_5_VLForConditionalGeneration` | Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| `Qwen2ForCausalLM` | Qwen/Qwen2.5-7B-Instruct, Qwen/QwQ-32B | text-to-text | bfloat16, float32 | Yes |
| `Qwen3_5ForConditionalGeneration` | Qwen/Qwen3.5-27B | text-to-text | bfloat16, float32 | No |
| `Qwen3ForCausalLM` | Qwen/Qwen3-8B, Qwen/Qwen3-30B-A3B, Qwen/Qwen3-Embedding-0.6B, Qwen/Qwen3-Embedding-4B, Qwen/Qwen3-Embedding-8B | text-to-embeddings, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| `Qwen3MoeForCausalLM` | Qwen/Qwen3-30B-A3B-Instruct, Qwen/Qwen3-30B-A3B-Instruct-2507-FP8 | text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| `Qwen3VLForConditionalGeneration` | Qwen/Qwen3-VL-4B-Instruct, Qwen/Qwen3-VL-2B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| `Qwen3VLMoeForConditionalGeneration` | Qwen/Qwen3-VL-30B-A3B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
| `QwenImageEditPipeline` | Qwen/Qwen-Image-Edit-2511 | image-to-image, text-to-image | bfloat16 | No |
| `QwenImageEditPlusPipeline` | Qwen/Qwen-Image-Edit-2511 | image-to-image, text-to-image | bfloat16 | No |
| `QwenImagePipeline` | Qwen/Qwen-Image-2512 | text-to-image | bfloat16 | No |
| `Step3p5ForCausalLM` | stepfun-ai/Step-3.5-Flash | text-to-text | bfloat16 | Yes |
| `UnifiedDflashKimiK25ForCausalLM` | nvidia/Kimi-K2.5-NVFP4 | image-to-text, text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| `UnifiedDflashLlama3ForCausalLM` | meta-llama/Llama-3.2-3B-Instruct | text-to-text | bfloat16, float32 | No |
| `UnifiedMTPDeepseekV3ForCausalLM` | deepseek-ai/DeepSeek-V3 | text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
| `WanImageToVideoPipeline` | Wan-AI/Wan2.2-I2V-A14B-Diffusers, Wan-AI/Wan2.1-I2V-14B-720P-Diffusers | text-to-image | bfloat16, float32 | No |
| `WanPipeline` | Wan-AI/Wan2.2-T2V-A14B-Diffusers, Wan-AI/Wan2.1-T2V-14B-Diffusers, Wan-AI/Wan2.2-TI2V-5B-Diffusers, yetter-ai/Wan2.2-TI2V-5B-Turbo-Diffusers | text-to-image | bfloat16, float32 | No |
| `ZImagePipeline` | Tongyi-MAI/Z-Image, Zyphra/Z-Image | text-to-image | bfloat16 | No |
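
One way to read the table is as a filter over modality, encoding, and multi-GPU support. The Python sketch below shows that lookup on five rows copied from the table above. The `ROWS` list and the `find_architectures` helper are hypothetical names for illustration, not part of MAX, and the example model column is left out for brevity.

```python
# A handful of rows copied from the table: (architecture, modality, encodings, multi-GPU).
ROWS = [
    ("BertModel", "text-to-embeddings", "bfloat16, float32", "No"),
    ("Gemma3ForConditionalGeneration", "image-to-text, text-to-text",
     "bfloat16, float8_e4m3fn", "Yes"),
    ("LlamaForCausalLM", "text-to-text",
     "bfloat16, float32, float4_e2m1fnx2, float8_e4m3fn, gptq, q6_k", "Yes"),
    ("Phi3ForCausalLM", "text-to-text", "bfloat16, float32", "No"),
    ("Qwen3ForCausalLM", "text-to-embeddings, text-to-text",
     "bfloat16, float32, float8_e4m3fn", "Yes"),
]


def split_cell(cell: str) -> set[str]:
    """Split a comma-separated table cell into a set of trimmed values."""
    return {item.strip() for item in cell.split(",")}


def find_architectures(rows, modality=None, encoding=None, multi_gpu=None):
    """Return architecture names matching every filter that is not None."""
    matches = []
    for name, modalities, encodings, gpu in rows:
        if modality is not None and modality not in split_cell(modalities):
            continue
        if encoding is not None and encoding not in split_cell(encodings):
            continue
        if multi_gpu is not None and (gpu == "Yes") != multi_gpu:
            continue
        matches.append(name)
    return matches


# Text-generation architectures that list float8_e4m3fn and multi-GPU support.
print(find_architectures(ROWS, modality="text-to-text",
                         encoding="float8_e4m3fn", multi_gpu=True))
# → ['Gemma3ForConditionalGeneration', 'LlamaForCausalLM', 'Qwen3ForCausalLM']
```

Leaving a filter as `None` skips it, so `find_architectures(ROWS, multi_gpu=False)` returns only the single-GPU rows.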