11h ago

Sholto Douglas solicits detailed feedback on Claude limitations


Sholto Douglas of Anthropic posted an open request asking developers for detailed feedback on the scenarios where they prefer other models to Claude. The AI researcher is looking for specific examples and transcripts that highlight limitations, to guide refinements to the next model. In parallel, jason (@jxnlco) posted a similarly worded request focused on Codex, asking about user frustrations and the cases where other options are chosen.

Original post

When do you reach for other models instead of Claude? What can we do better? Hit me with all of your frustrations. dms open. If you can give me detail (e.g. specifics/transcripts) - it'll help a lot in finding out exactly what we need to do to improve the next model

7:21 PM · May 16, 2026 View on X

@_sholtodouglas I've stopped using Opus for brainstorming/strategizing, because it keeps wanting to jump to a conclusion at the end of every response. It's too confident it knows the answer every time. It makes it hard to have a back-and-forth.

Also, it's too expensive vs Codex 5.5 sub.

6:10 AM · May 17, 2026 · 5.2K Views

@_sholtodouglas Claude Code on mobile. Standalone Claude Code app with the same aesthetics

2:57 AM · May 17, 2026 · 2.4K Views

Please stop flushing the KV cache in Claude Code every x hrs of being idle. When I wake up and go back to a session that was running through the night but stalled for whatever reason, Claude is noticeably far worse than when I resume within the window before the cache is flushed.

Also, I hate hearing I’m absolutely right when I’m not. :) It has significantly reduced my trust in the model.

11:31 AM · May 17, 2026 · 981 Views

Also, when an experiment is not working out (the kind that I know beyond a reasonable doubt should work), Claude jumps to a hypothesis for why the whole thing is broken and why we should just abandon it. So frustrating :) These are experiments where the resolution of whatever we stumble upon is to just change a few hyperparams and retry.

I found 4.6 to have way more agency than 4.7 on these types of problems, and to be more willing to pursue a longer-horizon attempt.

Dimitris Papailiopoulos @DimitrisPapail

11:36 AM · May 17, 2026 · 756 Views

Voice-to-text is still far, far behind ChatGPT. I always go back to ChatGPT any time I want to transcribe. For some reason Claude has a hard time with my Greek accent. It also does not work when switching language mid-speech.

And on voice, Claude’s accent when it attempts to speak Greek is terrible.

11:39 AM · May 17, 2026 · 179 Views

@jxnlco Claude’s integration into Word in particular is superb. I always reach for it when editing a document.

jason @jxnlco

When do you reach for other models instead of Codex? What can we do better? Hit me with all of your frustrations. dms open. If you can give me detail (e.g. specifics/transcripts) - it'll help a lot in finding out exactly what we need to do to improve the next model

2:36 AM · May 17, 2026 · 112.6K Views
11:21 AM · May 17, 2026 · 410 Views

I want a LaTeX editor, and for Claude to be able to read docs at a coarse-grained level.

It's good at editing segments, but terrible at reading the whole long document and achieving global coherence / flow.

Maybe hierarchical doc chunking/compression for better writing would be good

8:24 AM · May 17, 2026 · 1K Views

@jxnlco lmao

Sholto Douglas @_sholtodouglas

When do you reach for other models instead of Claude? What can we do better? Hit me with all of your frustrations. dms open. If you can give me detail (e.g. specifics/transcripts) - it'll help a lot in finding out exactly what we need to do to improve the next model

2:21 AM · May 17, 2026 · 211.4K Views
2:47 AM · May 17, 2026 · 70.2K Views

@trq212 ahahahah

Thariq @trq212

@jxnlco lmao

2:47 AM · May 17, 2026 · 70.2K Views
2:50 AM · May 17, 2026 · 10.3K Views

@trq212 I need to sholto maxi

2:51 AM · May 17, 2026 · 3.6K Views

Original post

jason @jxnlco

When do you reach for other models instead of Coded? What can we do better? Hit me with all of your frustrations. dms open.

If you can give me detail (e.g. specifics/transcripts) - it'll help a lot in finding out exactly what we need to do to improve the next model

2:26 AM · May 17, 2026 · 1.2K Views

@_sholtodouglas - Can it stop saying it will take 2-3 weeks to do something it does in 10 minutes

- better test coverage

- better at writing comments (doesn’t need life story)

11:58 AM · May 17, 2026 · 87 Views
Sholto Douglas solicits detailed feedback on Claude limitations · Digg