Qwen/Qwen3.5-397B-A17B
Multimodal MoE model with gated delta networks architecture, 397B total / 17B active parameters, up to 262K context
Verified on 8x H200, 8x MI300X/MI355X, and GB200 nodes