Researcher Notes February Paper Proposed RL With Text Feedback Method

Original post

Exciting work! But in our February paper, "Reinforcement Learning with Text Feedback", we proposed the same methodology: predicting environment feedback on top of the RL loss. Nice to see this idea specialized to agentic terminal tasks, and the new insight this brings 💡. [1/2]

3:49 PM · May 19, 2026