Qwen/Qwen2.5-VL-7B-Instruct
Qwen2.5-VL dense vision-language model (7B) for image and video understanding — fits on a single TPU v6e chip or one GPU.
Verified on a single TPU v6e (Trillium) chip in BF16.
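As a rough sanity check on the single-chip claim, the sketch below estimates the memory taken by the weights alone in BF16 (2 bytes per parameter). The parameter count (a nominal 7e9, which ignores any difference from the exact published count) and the 32 GiB per-chip HBM figure are assumptions for illustration, not values stated in this card. Activations, the KV cache, and image/video tokens need additional memory on top of this.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: nominal "7B" parameter count; the exact count may differ.
NOMINAL_PARAMS = 7e9
# Assumption: HBM capacity per chip, in GiB; adjust for the actual hardware.
HBM_GIB = 32.0

weights_gib = weight_memory_gib(NOMINAL_PARAMS)  # BF16 -> 2 bytes per parameter
headroom_gib = HBM_GIB - weights_gib             # left for activations and KV cache

print(f"weights: {weights_gib:.1f} GiB, headroom: {headroom_gib:.1f} GiB")
```

The headroom figure is only an upper bound on what is available at runtime, since the serving framework reserves memory of its own.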