GLM-5V Replaces Visual Tokens With Special <|image|> Token

Original post
finbarr (@finbarrtimbers) · 1:27 PM · May 27, 2026
the GLM-5V MTP setup is interesting; they replace visual tokens with a <|image|> special token and it works better than passing the actual visual embeddings.

Reply
finbarr (@finbarrtimbers)
from https://arxiv.org/abs/2604.26752
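A rough sketch of what "replace visual tokens with a <|image|> special token" could look like on the input side of an MTP module. This is not taken from the paper: the function name `build_mtp_inputs`, the toy embedding table, the id chosen for `<|image|>`, and the assumption that the swap happens at embedding lookup are all hypothetical, used only to illustrate the idea in the post.

```python
def build_mtp_inputs(token_ids, is_visual, embed_table, image_token_id):
    """Look up MTP-side input embeddings, using one special-token
    embedding at every visual position (hypothetical layout)."""
    # Swap the id at each visual position for the special-token id.
    mtp_ids = [image_token_id if v else t for t, v in zip(token_ids, is_visual)]
    # Plain table lookup: no vision-encoder output is consulted here.
    return [embed_table[i] for i in mtp_ids]


# Toy setup; every value below is made up for illustration.
VOCAB_SIZE = 16
IMAGE_TOKEN_ID = 15  # hypothetical id for <|image|>
embed_table = [[float(i), float(i) + 0.5] for i in range(VOCAB_SIZE)]

token_ids = [3, 0, 0, 0, 7]  # ids at visual slots are placeholders
is_visual = [False, True, True, True, False]

mtp_inputs = build_mtp_inputs(token_ids, is_visual, embed_table, IMAGE_TOKEN_ID)
```

In this sketch all three visual positions share the same embedding row, while the text positions keep their own. The alternative the post compares against would instead place the vision encoder's per-patch embeddings at those positions.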