We can learn a model that provides shaped "process rewards" for robotic RL and evolves automatically as the policy gets better. This improves performance on benchmarks, and works in the real world! Some fun new work with Raymond Tsao & @ajwagenmaker
Success Visitation Matching Speeds Robotic RL With Learned Process Rewards
Users praise the new method for learning evolving shaped rewards in robotic reinforcement learning, noting that dense rewards remain highly effective for task mastery.
