Qwen/Qwen3.5-35B-A3B

Compact Qwen3.5 multimodal MoE (35B total / 3B active) with gated delta networks, 256 experts, and 262K context

Compact Qwen3.5 MoE: single-GPU FP8 or 2-GPU BF16 serving

moe | 35B / 3B | 262,144 ctx | vLLM 0.17.0+ | multimodal | text
Guide
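
The summary pairs FP8 with one GPU and BF16 with two. A rough weight-memory estimate suggests why. The sketch below is only back-of-the-envelope arithmetic, not part of vLLM: the helper names are hypothetical, it assumes 1 byte per parameter for FP8 and 2 bytes for BF16, it assumes weights are split evenly across GPUs, and it ignores KV cache, activations, and runtime overhead. It uses the 35B total parameter count rather than the 3B active count, on the assumption that all experts must be resident in memory even though only a few are active per token.

```python
# Back-of-the-envelope weight-memory estimate (weights only).
# Assumptions: 1 byte/param for FP8, 2 bytes/param for BF16,
# even sharding across GPUs, no KV cache or runtime overhead counted.

TOTAL_PARAMS = 35e9  # total parameters from the model card (35B)

BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}


def weight_gib(total_params: float, dtype: str) -> float:
    """Approximate size of the weights in GiB for the given dtype."""
    return total_params * BYTES_PER_PARAM[dtype] / 2**30


def per_gpu_weight_gib(total_params: float, dtype: str, num_gpus: int) -> float:
    """Approximate weight footprint per GPU, assuming an even split."""
    return weight_gib(total_params, dtype) / num_gpus


if __name__ == "__main__":
    for dtype, gpus in (("fp8", 1), ("bf16", 2)):
        total = weight_gib(TOTAL_PARAMS, dtype)
        each = per_gpu_weight_gib(TOTAL_PARAMS, dtype, gpus)
        print(f"{dtype}: {total:.1f} GiB total, {each:.1f} GiB per GPU on {gpus} GPU(s)")
```

Under these assumptions, both configurations leave a similar weight footprint per GPU. Whatever memory remains on each device is what is available for KV cache, which matters at long context lengths.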