BertModel | sentence-transformers/all-MiniLM-L6-v2, sentence-transformers/all-MiniLM-L12-v2 | text-to-embeddings | bfloat16, float32 | No |
DeepseekV2ForCausalLM | deepseek-ai/DeepSeek-V2-Lite-Chat | text-to-text | bfloat16 | Yes |
DeepseekV32ForCausalLM | deepseek-ai/DeepSeek-V3.2, deepseek-ai/DeepSeek-V3.2-Exp | text-to-text | float8_e4m3fn | Yes |
DeepseekV3ForCausalLM | deepseek-ai/DeepSeek-V3 | text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
ExaoneForCausalLM | LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct, LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct, LGAI-EXAONE/EXAONE-3.5-32B-Instruct | text-to-text | bfloat16, float32, q6_k | No |
Flux2KleinPipeline | black-forest-labs/FLUX.2-klein-4B, black-forest-labs/FLUX.2-klein-9B, black-forest-labs/FLUX.2-klein-base-4B, black-forest-labs/FLUX.2-klein-base-9B | image-to-image, text-to-image | bfloat16 | No |
Flux2Pipeline | black-forest-labs/FLUX.2-dev, black-forest-labs/FLUX.2-dev-NVFP4 | image-to-image, text-to-image | bfloat16, float4_e2m1fnx2 | No |
FluxPipeline | black-forest-labs/FLUX.1-dev, black-forest-labs/FLUX.1-schnell | text-to-image | bfloat16 | No |
Gemma3ForCausalLM | google/gemma-3-1b-it, google/gemma-3-1b-pt | text-to-text | bfloat16 | Yes |
Gemma3ForConditionalGeneration | google/gemma-3-4b-it, google/gemma-3-4b-pt, google/gemma-3-12b-it, google/gemma-3-12b-pt, google/gemma-3-27b-it, google/gemma-3-27b-pt | image-to-text, text-to-text | bfloat16, float8_e4m3fn | Yes |
Gemma4ForConditionalGeneration | google/gemma-4-31B-it | image-to-text, text-to-text, video-to-text | bfloat16 | No |
GptOssForCausalLM | openai/gpt-oss-20b, openai/gpt-oss-120b, unsloth/gpt-oss-20b-BF16 | text-to-text | bfloat16, float4_e2m1fnx2 | Yes |
GraniteForCausalLM | ibm-granite/granite-3.1-8b-instruct, ibm-granite/granite-3.1-8b-base | text-to-text | bfloat16, float32 | No |
Idefics3ForConditionalGeneration | HuggingFaceM4/Idefics3-8B-Llama3 | image-to-text, text-to-text | bfloat16 | No |
InternVLChatModel | OpenGVLab/InternVL3-8B-Instruct | image-to-text, text-to-text | bfloat16 | Yes |
KimiK25ForConditionalGeneration | moonshotai/Kimi-K2.5, nvidia/Kimi-K2.5-NVFP4 | image-to-text, text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
KimiVLForConditionalGeneration | moonshotai/Kimi-VL-A3B-Instruct | image-to-text, text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
LlamaForCausalLM | meta-llama/Llama-3.1-8B-Instruct, deepseek-ai/DeepSeek-R1-Distill-Llama-8B, meta-llama/Llama-Guard-3-8B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B-Instruct, deepseek-ai/deepseek-coder-6.7b-instruct, modularai/Llama-3.1-8B-Instruct-GGUF | text-to-text | bfloat16, float32, float4_e2m1fnx2, float8_e4m3fn, gptq, q6_k | Yes |
LlavaForConditionalGeneration | mistral-community/pixtral-12b | image-to-text, text-to-text | bfloat16 | No |
MambaForCausalLM | state-spaces/mamba-130m-hf | text-to-text | bfloat16, float32 | No |
Mistral3ForConditionalGeneration | mistralai/Mistral-Small-3.1-24B-Instruct-2503 | text-to-text | bfloat16 | Yes |
MistralForCausalLM | mistralai/Mistral-Nemo-Instruct-2407 | text-to-text | bfloat16 | Yes |
MPNetForMaskedLM | sentence-transformers/all-mpnet-base-v2 | text-to-embeddings | bfloat16, float32 | No |
Olmo2ForCausalLM | allenai/OLMo-2-0425-1B-Instruct, allenai/OLMo-2-1124-7B, allenai/OLMo-2-1124-13B-Instruct, allenai/OLMo-2-0325-32B-Instruct, allenai/OLMo-2-1124-7B-GGUF | text-to-text | bfloat16, float32 | No |
Olmo3ForCausalLM | allenai/Olmo-3-7B-Instruct | text-to-text | bfloat16 | No |
OlmoForCausalLM | allenai/OLMo-1B-hf, allenai/OLMo-1B-0724-hf | text-to-text | bfloat16, float32 | No |
Phi3ForCausalLM | microsoft/phi-4, microsoft/Phi-3.5-mini-instruct | text-to-text | bfloat16, float32 | No |
Qwen2_5_VLForConditionalGeneration | Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
Qwen2ForCausalLM | Qwen/Qwen2.5-7B-Instruct, Qwen/QwQ-32B | text-to-text | bfloat16, float32 | Yes |
Qwen3ForCausalLM | Qwen/Qwen3-8B, Qwen/Qwen3-30B-A3B, Qwen/Qwen3-Embedding-0.6B, Qwen/Qwen3-Embedding-4B, Qwen/Qwen3-Embedding-8B | text-to-embeddings, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
Qwen3MoeForCausalLM | Qwen/Qwen3-30B-A3B-Instruct, Qwen/Qwen3-30B-A3B-Instruct-2507-FP8 | text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
Qwen3VLForConditionalGeneration | Qwen/Qwen3-VL-4B-Instruct, Qwen/Qwen3-VL-2B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
Qwen3VLMoeForConditionalGeneration | Qwen/Qwen3-VL-30B-A3B-Instruct | image-to-text, text-to-text | bfloat16, float32, float8_e4m3fn | Yes |
QwenImageEditPipeline | Qwen/Qwen-Image-Edit-2511 | image-to-image, text-to-image | bfloat16 | No |
QwenImageEditPlusPipeline | Qwen/Qwen-Image-Edit-2511 | image-to-image, text-to-image | bfloat16 | No |
QwenImagePipeline | Qwen/Qwen-Image-2512 | text-to-image | bfloat16 | No |
UnifiedMTPDeepseekV3ForCausalLM | deepseek-ai/DeepSeek-V3 | text-to-text | bfloat16, float4_e2m1fnx2, float8_e4m3fn | Yes |
WanImageToVideoPipeline | Wan-AI/Wan2.2-I2V-A14B-Diffusers, Wan-AI/Wan2.1-I2V-14B-720P-Diffusers | text-to-image | bfloat16, float32 | No |
WanPipeline | Wan-AI/Wan2.2-T2V-A14B-Diffusers, Wan-AI/Wan2.1-T2V-14B-Diffusers, Wan-AI/Wan2.2-TI2V-5B-Diffusers, yetter-ai/Wan2.2-TI2V-5B-Turbo-Diffusers | text-to-image | bfloat16, float32 | No |
ZImagePipeline | Tongyi-MAI/Z-Image, Zyphra/Z-Image | text-to-image | bfloat16 | No |