Model Recipes
NVIDIA

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16

Mamba2-Transformer hybrid MoE omnimodal model (31B total / 3B active) with unified video, audio, image, and text understanding; reasoning + tool calling; BF16, FP8, and NVFP4 variants

moe31B / 3B262,144 ctxvLLM 0.20.0+multimodaltext
Guide