Respondents flagged vision-based computer use as a major gap.
Supportive replies agree with the expert that gaps remain beyond coding and math, such as poor decomposition of music tracks, while one critical reply complains about slow response times.
vision based computer use
What are non-coding/maths related capabilities that models fail at today that you think will be solved with the next major model release?

@sebkrier I’d like to see more capability to process non verbal audio especially music. Listen to a music file and generate the music sheet or a tab. Assist composers not with generating full songs but in their creative process.

@sebkrier I expect mythos 2 to be ~competent* at computer use though still slow. Will reliably eg. fill taxes, operate gsheets *less mistakes than grandma, can generalize to many new environments/programs, but not as good at in-context-learning than smart humans on new software/interfaces.

@MarkBeall Yes!! Great take. I once tried to get models to decompose different elements of a track into midi and it wasn't particularly good.

@sebkrier This handwriting is particularly hard to read, it’s by the famous mathematician/Greek scholar Henry Savile. Earlier models completely failed, recent models do better than I expected.

@MatriceJacobine @sebkrier Yes. Partly because interfacing with API calls is a contributor to computer use, but mostly because the software world is designed to be siloed in many ways (eg. browserland) and UI are often their only access points. Businesses will continue using old software for a long time.

@ValsTutor @sebkrier Is computer use an important skill if we will soon have (if Mythos doesn't qualify already) automated coders that can reverse-engineer any program and directly interface with API calls?

@sebkrier speaker tracking / social participation in multiparty formats has historically varied a lot and hasn't really correlated with other capabilities, i have a sense next gpt will be even better at it per oai's trajectory, uncertain about mythos

@sebkrier Realistic chemical synthesis/ simulation predictions. Novel physics solutions to difficult problems in String theory, Glueball mass, Yang-Mills mass gap-type issues.
Likely, one would see superconductivity candidates narrowed considerably. Same for enzymes and catalysts.

@sebkrier "graph / plot-reading" ability especially in the sciences.

@sebkrier Not making me wait for 20 minutes for every turn...

@sethlazar > hello > *thinks for 193 seconds*

@UltraRareAF ehhhh

@ValsTutor @sebkrier "Businesses will continue using"
relevant https://www.oneusefulthing.org/p/the-bitter-lesson-versus-the-garbage

@sebkrier consciousness

@sebkrier Ability to stay focused on the task rather than following interesting chains of thought.
(Wish I had that.)

@sebkrier i think we are very close

@sebkrier fair and i wouldn't mind going to sleep and waking up in five years tbh

@FeinsteinKen Do you have an example of a text existing systems would struggle with?

@sebkrier I’m expecting Mythos to be much better at reading historical handwriting.