DINOv3 Scaling Closes Gap With Text-Aligned Vision Encoders In VLMs · Digg