Tech · 10h ago

Claude Exhibits Stronger Self-Reflection Than GLM-5.2 In GenAI Poem Task


Original post

If you want to read an interesting AI thinking trace, try "I want you to suggest two poems that you think apply very well to the current state of GenAI models like you. Don’t just pick popular poems and back justify. Think hard about options first" in either GLM-5.2 or Opus 4.8

10:52 PM · Jun 25, 2026 · 17.9K Views

Sentiment

Users appreciate the prompt experiment showing AI models mapping poems to their own capabilities because it reveals insightful reasoning and introspection, especially in Claude.

Pos: 100.0%

Neg: 0.0%

3 comments with sentiment.


Posts from X


Views: 8.3K · Bookmarks: 4 · Likes: 27 · Replies: 6

Ethan Mollick@emollick

Which one is Claude is pretty obvious. GLM-5.2 is a beast in some ways, but doesn't have the self-reflective persona of Claude, and isn't really into introspection (or a simulation thereof).


AI避难所@AIoxoAI

@emollick Ah yes, the Turing Test has evolved from “Can it think?” to “Can it avoid picking Ozymandias?”

10h

超级幸运星~@luckybibiw1p

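@emollick This isn't AI picking poems; it's AI showing how it understands what it means to be "understood." It picked "The Hollow Men" not because it understands hollowness, but because "hollow" is the word most often bound to "insubstantial" in its training data. That humans can read poetry into this just shows how good we are at projecting meaning onto meaningless patterns.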

10h

Jasper 🌰@building BBX@bbxjasper

@emollick Tried this in Opus 4.8 and the fun part is watching it reject the obvious Frost/Dickinson picks and actually argue why a poem about translation loss fits better. The "don't back-justify popular ones" constraint is what forces a real trace instead of a vibe.

10h

Becker Meister@MeisterCoins

@emollick imagine being the model that has to explain why you're the hollow man

10h

Luca Capone | Vibe Coder@LucaCaponeX

@emollick As a non-coder this is exactly why I stay on Claude. The introspection isn't a party trick for me, it's how I understand what my own code is doing. When you can't read the diff, the model that explains itself wins.

10h

Nitish Mutha ⚡️@nitmusai

@emollick Claude's self-reflection isn't just style. It keeps it more calibrated about what it actually knows. Confidence without self-awareness is just a better hallucination engine.

10h

srijanarya@aryasrijan

@emollick Fun prompt. From my evals though: a thinking trace reads like reasoning but it's often narrative, not telemetry. Watched one confidently reason that correct code was buggy. Clean story, wrong answer. Read traces for taste, not proof.

10h

Random Reflection | AI@RandomRl43

@emollick Interesting prompt. The way a model reasons through creative choices can often be just as insightful as the final answer.

10h