Model Recipes
JetBrains

JetBrains/Mellum2-12B-A2.5B-Instruct

JetBrains' instruction-tuned code MoE (12B total / 2.5B active parameters) that answers directly, without an externalized chain of thought, for low-latency coding and tool use

78.4 EvalPlus, 67.1 MultiPL-E; direct answers; fits on a single GPU
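
As a rough sanity check on the single-GPU claim, the sketch below estimates the memory needed for the weights alone from the 12B total parameter count. The storage precisions (bf16 and fp8) are assumptions for illustration and are not stated on this card. KV cache and activation memory are not included, so the real footprint will be higher. Note that in an MoE model all experts normally have to be resident in memory even though only about 2.5B parameters are active per token.

```python
def weight_memory_gib(total_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return total_params * bytes_per_param / 2**30


TOTAL_PARAMS = 12e9    # total parameters, from the card
ACTIVE_PARAMS = 2.5e9  # parameters active per token, from the card

# Assumed storage precisions (hypothetical; the card does not specify one).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0}

estimates = {
    name: weight_memory_gib(TOTAL_PARAMS, nbytes)
    for name, nbytes in BYTES_PER_PARAM.items()
}

for name, gib in estimates.items():
    print(f"{name}: ~{gib:.1f} GiB of weights")

# Share of parameters used per token; this affects compute, not resident memory.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active fraction per token: {active_fraction:.1%}")
```

Under these assumptions the weights take roughly 22 GiB in bf16 and about half that in fp8; whether that fits a particular GPU depends on the context length actually used and the serving engine's overhead.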

MoE · 12B / 2.5B · 131,072 ctx · vLLM nightly+ · text
Guide