I'm testing multiple small VLMs as judges; each part of the pipeline comes from a different model family
Let me know below if you want me to test any other models; these are very convenient
I'm working on a little something; here's a spoiler
I learned so much about VLM labelling & judging from the process; currently adding instance segmentation and more infra options


