Model Recipes
InternLM

internlm/Intern-S2-Preview

Scientific multimodal MoE model (36B total / 3B active parameters), built by continued pre-training from Qwen3.5. It uses hybrid linear/full attention, supports a 262K-token context, and accelerates reasoning with MTP. BF16 and FP8 checkpoints are available.

35B-A3B scientific multimodal foundation model: single-node BF16 with MTP

moe | 36B / 3B | 262,144 ctx | vLLM nightly+ | multimodal | text
Guide