nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

NVIDIA Nemotron 3 Ultra hybrid Transformer-Mamba MoE model for long-context agentic reasoning, coding, and tool use.

550B total / 55B active parameters with BF16 and NVFP4 serving paths

MoE | 550B / 55B | 262,144 ctx | vLLM 0.22.0+ | text
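As a rough sizing aid, the parameter counts above translate into weight memory as sketched below. This is a back-of-the-envelope estimate, not a figure from the recipe: it assumes every parameter is stored at the stated precision, takes NVFP4 to cost about 4.5 bits per parameter (4-bit values plus one 8-bit scale per 16-value block), and ignores KV cache, activations, and runtime overhead. The names `weight_gb` and `estimate` are illustrative only.

```python
# Rough weight-memory estimate for the two serving paths named above.
# Assumption: all 550B parameters are stored at the listed precision;
# real checkpoints often keep some layers in higher precision.
TOTAL_PARAMS = 550e9   # total parameters (from the model card)
ACTIVE_PARAMS = 55e9   # parameters active per token (from the model card)

BITS_PER_PARAM = {
    "BF16": 16.0,
    "NVFP4": 4.5,  # assumption: 4-bit values + one 8-bit scale per 16 values
}


def weight_gb(params: float, bits_per_param: float) -> float:
    """Return the approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


def estimate() -> dict[str, float]:
    """Map each serving path to its approximate weight footprint in GB."""
    return {name: weight_gb(TOTAL_PARAMS, bits) for name, bits in BITS_PER_PARAM.items()}


if __name__ == "__main__":
    for name, gb in estimate().items():
        print(f"{name}: ~{gb:,.0f} GB of weights")
    print(f"Active fraction per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.0%}")
```

Under these assumptions the BF16 weights come to roughly 1.1 TB and the NVFP4 weights to roughly 0.3 TB, while only about 10% of the parameters are active for any given token. Note that the active-parameter count reduces compute per token, not the memory needed to hold the weights.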
Guide