
Qwen/Qwen3-ASR-1.7B

Speech-to-text model supporting 11 languages, multiple accents, and singing voice with customizable text-context prompting.

Accurate multilingual ASR, including singing voice; single-GPU serving

dense · 2.3B · 65,536 ctx · vLLM 0.12.0+ · multimodal
Guide