Model Recipes
OpenAI

openai/gpt-oss-20b

OpenAI's gpt-oss-20b is a Mixture-of-Experts (MoE) reasoning model with 21B total parameters, of which 3.6B are active per token. It ships with native MXFP4 quantization and fits in 16 GB of VRAM.


MoE | 21B total / 3.6B active | 131,072-token context | vLLM 0.10.0+ | text
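
The 16 GB figure can be sanity-checked with a rough back-of-the-envelope estimate. The sketch below is illustrative only and rests on two assumptions that are not from this page: that every one of the 21B parameters is stored in MXFP4 (4-bit elements plus one shared 8-bit scale per 32-element block, as in the OCP Microscaling format), and that "16GB" means 16 × 10^9 bytes. In practice some tensors are likely kept at higher precision, so the real footprint would be somewhat larger. Note that all 21B parameters must be resident in memory even though only 3.6B are active per token.

```python
# Rough weight-memory estimate for an MXFP4-quantized MoE model.
# ASSUMPTION: all parameters are stored in MXFP4; real checkpoints may keep
# some tensors (e.g. attention or embeddings) at higher precision.

MXFP4_ELEMENT_BITS = 4   # each value is a 4-bit float
MXFP4_SCALE_BITS = 8     # one shared 8-bit scale per block
MXFP4_BLOCK_SIZE = 32    # values per block


def mxfp4_bits_per_param() -> float:
    """Effective storage cost per parameter, including the shared scale."""
    return MXFP4_ELEMENT_BITS + MXFP4_SCALE_BITS / MXFP4_BLOCK_SIZE


def weight_bytes(total_params: float) -> float:
    """Bytes needed to hold `total_params` parameters in MXFP4."""
    return total_params * mxfp4_bits_per_param() / 8


TOTAL_PARAMS = 21e9   # total (not active) parameters, from the card above
VRAM_BYTES = 16e9     # ASSUMPTION: "16GB" read as decimal gigabytes

weights = weight_bytes(TOTAL_PARAMS)
headroom = VRAM_BYTES - weights  # left for KV cache, activations, runtime

print(f"bits per parameter: {mxfp4_bits_per_param():.2f}")
print(f"weights:  {weights / 1e9:.1f} GB")
print(f"headroom: {headroom / 1e9:.1f} GB")
```

Under these assumptions the weights come to roughly 11 GB, which leaves a few gigabytes for the KV cache and activations. How much of the 131,072-token context actually fits in that headroom depends on the serving configuration.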
Guide