deepseek-ai/DeepSeek-V4-Flash

DeepSeek V4 MoE model with hybrid CSA+HCA attention, manifold-constrained hyper-connections, and three-tier reasoning (Non-think / Think High / Think Max).

Compact 284B/13B V4 sibling: single-node 1M-context serving with FP4+FP8 weights and MTP.

MoE | 284B / 13B | 1,048,576 ctx | vLLM 0.20.0+ | text
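
To put the figures above in proportion, here is a minimal arithmetic sketch. It assumes that "284B / 13B" follows the usual MoE convention of total parameters / parameters active per token; the card itself does not spell this out. The constant and function names are illustrative only and do not come from vLLM or the model repository.

```python
# Figures copied from the card. Reading them as total / active parameters
# is an assumption based on common MoE naming, not something the card states.
TOTAL_PARAMS_B = 284        # assumed: total parameters, in billions
ACTIVE_PARAMS_B = 13        # assumed: parameters active per token, in billions
CONTEXT_TOKENS = 1_048_576  # context length listed on the card


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of the weights used for a single token."""
    return active_b / total_b


frac = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
print(f"active share per token: {frac:.1%}")

# The listed context length is an exact power of two.
print(f"context equals 2**20 tokens: {CONTEXT_TOKENS == 2**20}")
```

Under that assumption, roughly one twentieth of the weights take part in each token, which is consistent with the card's description of the model as the compact sibling suited to single-node serving.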
Guide