deepseek-ai/DeepSeek-V4-Flash
DeepSeek V4 MoE model with hybrid CSA+HCA attention, manifold-constrained hyper-connections, and three-tier reasoning (Non-think / Think High / Think Max).
Compact 284B/13B V4 sibling: single-node 1M-context serving with FP4+FP8 weights and MTP.
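
For a rough sense of why single-node serving is plausible, the sketch below bounds the raw weight footprint. It assumes "284B/13B" means total parameters versus parameters active per token, which is the usual MoE convention but is not spelled out here. The actual FP4/FP8 split is not stated, so only an all-FP4 lower bound and an all-FP8 upper bound are computed; quantization scales, KV cache, and activations are ignored.

```python
# Back-of-envelope sizing. ASSUMPTION: "284B/13B" = total / active-per-token parameters.
TOTAL_PARAMS = 284e9
ACTIVE_PARAMS = 13e9


def weight_gb(params: float, bits_per_param: float) -> float:
    """Raw weight storage in GB (1 GB = 1e9 bytes), ignoring scales and metadata."""
    return params * bits_per_param / 8 / 1e9


# Share of the model that participates in each token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# The real mixed FP4+FP8 footprint lies somewhere between these two bounds.
fp4_gb = weight_gb(TOTAL_PARAMS, 4)  # lower bound: every weight stored in FP4
fp8_gb = weight_gb(TOTAL_PARAMS, 8)  # upper bound: every weight stored in FP8

print(f"active fraction: {active_fraction:.1%}")
print(f"weights: between {fp4_gb:.0f} GB and {fp8_gb:.0f} GB")
```

Under these assumptions, under 5% of the parameters are active per token, and the weights alone need somewhere between 142 GB and 284 GB before any context-dependent memory is counted.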