deepseek-ai/DeepSeek-V3.2-Exp
Experimental preview of DeepSeek-V3.2 with sparse attention (MQA-like logits) and an FP8 KV cache; apart from the sparse attention mechanism, the architecture matches DeepSeek-V3.1.
Sparse-attention MoE model with an FP8 KV cache and a GSM8K score of about 0.96