If we lived in a fundamentally more visual-intelligence-focused world, we'd probably have much more jagged, narrow superintelligence on the vision model side. There's no reason why Opus can CoT-decode infinitely recursed base64 but can't even CoT-transcribe a tiny QR code w/ no tools.
Claude Opus Handles Nested Base64 but Fails at Simple QR Code Transcription
This is the type of problem that the models like to sandbag on and/or reward hack on most aggressively. Transcribing a 21x21 grid and doing a procedure? Nah, let's FILL IN WITH A VAGUE PRIOR.
This is the type of thing that you can literally teach yourself to do with pen and paper: map out the grid, apply the procedure. The error correction code for the smallest QRs makes it overdetermined. It's just quite tedious, BUT! The smallest ones are tractable.
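For a sense of how small the job is, here's a rough sketch of the "map out the grid" step for a version 1 (21x21) symbol. It marks which modules are fixed structure (finder patterns, timing lines, format info) and counts what's left for data. The function names are made up for illustration, and the layout and codeword counts are as I understand the QR spec, so treat it as a sketch rather than a reference implementation.

```python
SIZE = 21  # a version 1 QR symbol is 21x21 modules

# (data codewords, error correction codewords) per level for version 1.
# Assumption: values as listed in the QR code standard's capacity tables.
V1_CODEWORDS = {"L": (19, 7), "M": (16, 10), "Q": (13, 13), "H": (9, 17)}


def function_modules_v1():
    """Return the set of (row, col) cells that carry structure, not data."""
    cells = set()
    # Three finder patterns: 7x7 each, plus a one-module separator,
    # so each occupies an 8x8 block in its corner.
    for r0, c0 in [(0, 0), (0, SIZE - 8), (SIZE - 8, 0)]:
        for r in range(r0, r0 + 8):
            for c in range(c0, c0 + 8):
                cells.add((r, c))
    # Timing patterns run along row 6 and column 6.
    for i in range(SIZE):
        cells.add((6, i))
        cells.add((i, 6))
    # First copy of the 15-bit format info, wrapped around the top-left finder.
    for i in range(9):
        cells.add((8, i))
        cells.add((i, 8))
    # Second copy of the format info, plus the always-dark module at (13, 8).
    for i in range(SIZE - 8, SIZE):
        cells.add((8, i))
        cells.add((i, 8))
    return cells


def data_module_count():
    """Count the modules left over for data and error correction bits."""
    return SIZE * SIZE - len(function_modules_v1())


if __name__ == "__main__":
    n = data_module_count()
    print(n, "data modules =", n // 8, "codewords")  # 208 data modules = 26 codewords
    for level, (data, ec) in V1_CODEWORDS.items():
        print(level, "->", data, "data +", ec, "error correction codewords")
```

So there are only 208 bits to read off, and even at the lowest level 7 of the 26 codewords are redundancy. That redundancy is what makes the smallest codes overdetermined: a few misread modules can still be caught or corrected.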