internlm/Intern-S2-Preview
Scientific multimodal MoE model (36B total / 3B active parameters), built by continued pre-training from Qwen3.5. Hybrid linear/full attention, 262K context, and reasoning accelerated by multi-token prediction (MTP). BF16 and FP8 checkpoints are available.
35B-A3B scientific multimodal foundation model — single-node BF16 with MTP