VSTAT Benchmark Exposes Multimodal LLMs' Failures in Video State Tracking · Digg