Qwen/Qwen3.6-35B-A3B

Smaller Qwen3.6 multimodal MoE model (35B total / 3B active) with BF16, FP8, and NVIDIA NVFP4 variants

Compact Qwen3.6 MoE with 3B active parameters, served on a single GPU in FP8 or on 2 to 4 GPUs in BF16
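The GPU counts above follow roughly from the size of the weights. As a back-of-the-envelope sketch, the snippet below estimates weight-only memory from the 35B total parameter count, assuming 2 bytes per parameter for BF16 and 1 byte for FP8. The function names are illustrative, and the estimate ignores KV cache, activations, and runtime overhead, so real requirements are higher.

```python
# Rough weight-only memory estimate for the 35B-total-parameter model.
# Assumption: 2 bytes/param for BF16, 1 byte/param for FP8.
# KV cache, activations, and framework overhead are NOT included.

TOTAL_PARAMS = 35e9  # 35B total parameters, from the model description

BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0}


def weight_gb(total_params: float, dtype: str) -> float:
    """Approximate size of the weights in GB (10^9 bytes)."""
    return total_params * BYTES_PER_PARAM[dtype] / 1e9


def per_gpu_gb(total_params: float, dtype: str, num_gpus: int) -> float:
    """Weights per GPU, assuming they are sharded evenly across GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return weight_gb(total_params, dtype) / num_gpus


if __name__ == "__main__":
    for dtype, gpus in [("fp8", 1), ("bf16", 2), ("bf16", 4)]:
        share = per_gpu_gb(TOTAL_PARAMS, dtype, gpus)
        print(f"{dtype} on {gpus} GPU(s): ~{share:.1f} GB of weights per GPU")
```

All 35B parameters must be resident even though only 3B are active per token, which is why the total count, not the active count, drives the memory estimate.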

moe | 35B / 3B | 262,144 ctx | vLLM 0.17.0+ | multimodal | text
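As a hedged illustration of how these values might map onto a launch command, the sketch below assembles a `vllm serve` invocation using the `--tensor-parallel-size` and `--max-model-len` options. The helper `build_serve_command` is hypothetical, and this section does not state which flags are recommended for the model, so treat the guide as authoritative.

```python
import shlex

MODEL_ID = "Qwen/Qwen3.6-35B-A3B"  # model identifier from this page
MAX_CONTEXT = 262_144              # context length listed for the model


def build_serve_command(model_id: str, tensor_parallel_size: int,
                        max_model_len: int = MAX_CONTEXT) -> list[str]:
    """Build an argument list for `vllm serve` (hypothetical helper)."""
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be at least 1")
    if not 1 <= max_model_len <= MAX_CONTEXT:
        raise ValueError(f"max_model_len must be in 1..{MAX_CONTEXT}")
    return [
        "vllm", "serve", model_id,
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--max-model-len", str(max_model_len),
    ]


if __name__ == "__main__":
    # Example: BF16 weights sharded across 2 GPUs at the full context length.
    print(shlex.join(build_serve_command(MODEL_ID, 2)))
```

Lowering `max_model_len` below the full 262,144 tokens is a common way to reduce KV-cache memory when the full context is not needed.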
Guide