Michelle Chen, Senior Product Manager for AI at Cloudflare, and research engineer Will Brown trained open models with reinforcement learning to replicate OpenAI’s goblin problem, using Prime Intellect infrastructure.
An interactive demo shows the RL training steps and outputs.
had soooo much fun going goblin mode with @michellechen for these models
say hello to goblintron :)

reverse engineering openai’s goblin problem: we took open models and trained them with RL to talk about goblins. an experiment by @willccbb and me, trained on @PrimeIntellect. here's an interactive blog of how RL works and how we achieved goblin mode: https://goblins.mchen.workers.dev
AGI achieved