meta-llama/Llama-4-Scout-17B-16E-Instruct
Llama 4 Scout is a mixture-of-experts (MoE) model with 16 experts and 17B active parameters. NVIDIA FP8 and FP4 quantized variants are available, and with quantization the model fits on a single GPU.
Architecture: MoE
Parameters: 109B total / 17B active
Context length: 10,485,760 tokens
Requires: vLLM 0.12.0+
Modality: text
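To see why quantization matters for single-GPU serving, the weight footprint can be roughly estimated from the total parameter count above. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes all 109B parameters must be resident in memory (for an MoE model, every expert is loaded even though only 17B parameters are active per token), and it ignores the KV cache, activations, and the per-block scale factors that quantized formats carry. The helper name weight_memory_gb is hypothetical.

```python
def weight_memory_gb(total_params: float, bits_per_param: int) -> float:
    """Approximate weight-only memory in GB (1 GB = 10^9 bytes)."""
    return total_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 109e9  # total parameter count listed on this page

# Assumed storage widths: 16 bits for BF16, 8 for FP8, 4 for FP4.
estimates = {
    name: weight_memory_gb(TOTAL_PARAMS, bits)
    for name, bits in [("BF16", 16), ("FP8", 8), ("FP4", 4)]
}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.1f} GB of weights")
```

Comparing these figures against the memory of the target GPU gives a first indication of which variant is feasible. Real deployments need additional headroom for the KV cache, which grows with the context length actually used.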
Guide