JetBrains/Mellum2-12B-A2.5B-Instruct
JetBrains' instruction-tuned mixture-of-experts (MoE) code model (12B total parameters, 2.5B active) that answers directly, without an externalized chain of thought, for low-latency coding and tool use
Scores 78.4 on EvalPlus and 67.1 on MultiPL-E; gives direct answers and fits on a single GPU