
Researchers Question Whether Multimodal Retrieval Benchmarks Require Multimodal Reasoning

Original post
Pasquale Minervini (@PMINERVINI, original poster) · Matteo Attimonelli (@MATTATTIMONELLI)

Do multimodal retrieval benchmarks actually require multimodal reasoning?? We analyse Composed Image Retrieval, which should require models to combine visual and textual information:

9:00 AM · May 18, 2026
