Model Recipes
Qwen

Qwen/Qwen2.5-VL-7B-Instruct

Qwen2.5-VL dense vision-language model (7B) for image and video understanding — fits on a single TPU v6e chip or one GPU.

Verified on TPU v6e (Trillium) with BF16 on a single chip

dense7B128,000 ctxvLLM 0.7.0+multimodaltext
Guide