
Analysis Exposes Bias in Reward Guidance for Diffusion Models

Original post
Ethan@torchcompiled

It’s a bit of a different school of thought from reward guidance, but CFG itself may be able to replicate the more creation aspect of RL and does have tilting properties. Condition on the reward, and CFG provides the extrapolation/maximization.

10:31 PM · Jun 5, 2026 · 3.6K Views
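
To make the tilting claim concrete, here is a minimal toy sketch, not taken from the linked post. It assumes a 1D setting with no noise schedule, where the unconditional model is N(0, 1) and the model conditioned on "high reward" is N(1, 1); those means and the helper names (`cfg_combine`, `gaussian_score`, `guided_mode`) are hypothetical. Applying the standard CFG combination to the scores corresponds to the tilted density p(x) · (p(x | c) / p(x))^w, whose mode in this toy case sits at w, so a guidance scale above 1 pushes past the conditional mean.

```python
import numpy as np

def cfg_combine(uncond, cond, w):
    """Standard CFG rule: start at uncond and extrapolate toward cond by scale w."""
    return uncond + w * (cond - uncond)

def gaussian_score(x, mean, std=1.0):
    """Score (gradient of the log density) of N(mean, std^2)."""
    return -(x - mean) / std**2

def guided_mode(w, mu_uncond=0.0, mu_cond=1.0):
    """Locate the zero of the guided score on a grid (toy assumption: unit variances)."""
    xs = np.linspace(-10.0, 10.0, 20001)
    s = cfg_combine(gaussian_score(xs, mu_uncond), gaussian_score(xs, mu_cond), w)
    return float(xs[np.argmin(np.abs(s))])

if __name__ == "__main__":
    for w in (0.0, 1.0, 3.0):
        print(f"w={w}: guided mode ~ {guided_mode(w):.3f}")
```

With w = 1 this recovers the conditional model; with w = 3 the mode lands near 3, beyond anything the reward-conditioned model alone favors. In an actual diffusion sampler the combination is applied at every noise level, so the sampled distribution need not match this tilted density exactly.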
Ethan@torchcompiled

https://www.ethansmith2000.com/post/classifier-free-guidance-and-reinforcement-learning

Ethan@torchcompiled

Correction: “more creation” in the post above is a typo for “mode creation.”
