Qwen/Qwen3.5-4B

Qwen3.5 compact dense multimodal model (4B) that fits on 16 GB consumer GPUs with the full 262K context

Consumer-GPU-friendly Qwen3.5 dense model with multi-token prediction (MTP) support

dense · 4B · 262,144 ctx · vLLM 0.17.0+ · multimodal · text
Guide