Joey Gonzalez of UC Berkeley unveils Stateful Visual Encoders to help vision-language models compare images directly in the visual pipeline · Digg