Model Recipes
GLM (Z-AI)

zai-org/GLM-4.5

GLM-4.5 MoE language model (~358B total parameters, BF16) with built-in MTP layers for speculative decoding and native tool calling

GLM-4.X series MoE model with native FP8 and BF16 support and MTP speculative decoding

MoE · 358B / 32B · 131,072 ctx · vLLM 0.11.0+ · text
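
As a rough sizing aid, the parameter count and precisions listed above translate into weight storage as follows. This is a back-of-the-envelope sketch, not a deployment requirement: it assumes 2 bytes per parameter for BF16 and 1 byte per parameter for FP8, counts weights only (no KV cache, activations, or runtime overhead), and the helper name `weight_memory_gb` is hypothetical.

```python
# Rough weight-memory estimate for the model card above.
# Assumptions (not from the source): 2 bytes/param for BF16, 1 byte/param for FP8,
# weights only; KV cache, activations, and runtime overhead are ignored.

TOTAL_PARAMS = 358e9  # ~358B total parameters, as stated in the card

BYTES_PER_PARAM = {"BF16": 2, "FP8": 1}


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: ~{weight_memory_gb(TOTAL_PARAMS, nbytes):.0f} GB of weights")
```

Because all experts of an MoE model must be resident in memory even though only a fraction of the parameters is used per token, the total parameter count rather than the smaller figure is the one that drives this estimate.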
Guide