cool, you can triple down on this and that's fine, but in line with your own phrase "just trying to have a net-positive effect on the world" let's have an actual discussion about this instead of ragebaiting on my feed. again, whether anthropic was inspired or not is all speculative, and I agree with you that I can't make that claim.
> the authors themselves decided to grandly establish a new label 'RLM' — for LLMs that are prompted with big text files + search/call tools
it's ironic because you somehow still don't get it, despite genuinely being right that it's a super simple concept. I'm not masking the fact that it's a simple concept either, it's been stated countless times in my own writing. the paper itself, if you've read it, is about showing experimentally why such a system is useful and how well it does out-of-the-box on long context tasks. and by the way, Claude Code could handle big text files with search / call tools before dynamic workflows, so I'm not really sure what your point is.
it's code. it's giving an LLM unfiltered access to a code environment where sub-LLM calls are available as functions and everything, including the prompt, is an object in code. there's a one-sentence summary of how simple it is, you could've filled that into your monologue instead. you can also call it "CodeAct with sub-LLM calls and explicit context offloading" if that makes you happy, it's even ablated this way in the paper
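to make that one-sentence summary concrete, here's a minimal sketch of the loop. this is my own toy illustration, not the paper's implementation: `run_rlm`, `fake_root_lm`, `fake_sub_lm`, and the `context` / `llm_query` / `answer` names are all hypothetical, and both "models" are stubs so the thing runs without any API.

```python
from typing import Callable, Optional


def run_rlm(root_lm: Callable[[str], str],
            sub_lm: Callable[[str], str],
            prompt: str,
            max_steps: int = 8) -> Optional[object]:
    """Toy loop: the root LM only ever sees metadata and writes code.

    The full prompt lives in the environment as the variable `context`,
    and sub-LLM calls are exposed as the plain function `llm_query`.
    """
    env = {"context": prompt, "llm_query": sub_lm, "answer": None}
    history = [
        f"`context` is a str of length {len(prompt)}. "
        "call llm_query(text) for sub-LLM calls. set `answer` when done."
    ]
    for _ in range(max_steps):
        code = root_lm("\n".join(history))
        try:
            # NOTE: exec on model output is unsafe; a real system would sandbox this.
            exec(code, env)
            history.append(f"ran:\n{code}\nstatus: ok")
        except Exception as exc:
            history.append(f"ran:\n{code}\nerror: {exc!r}")
        if env["answer"] is not None:
            return env["answer"]
    return None


def fake_sub_lm(text: str) -> str:
    # Stub sub-LLM (assumption): "answers" whether the snippet mentions the needle.
    return "yes" if "needle" in text else "no"


def fake_root_lm(history: str) -> str:
    # Stub root LM (assumption): always emits the same chunk-and-query program.
    return (
        "chunks = [context[i:i + 1000] for i in range(0, len(context), 1000)]\n"
        "answer = [i for i, c in enumerate(chunks) if llm_query(c) == 'yes']\n"
    )


long_prompt = "x" * 2500 + "needle" + "x" * 2500
result = run_rlm(fake_root_lm, fake_sub_lm, long_prompt)
```

the root model never gets the 5k-character prompt in its own input here, only its length. and nothing stops `sub_lm` from itself being another `run_rlm` call, which is where the recursion would come from.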
and by the way, the naming comes from the fact that such a system lends itself well to acting as a "language model" wrapped around neural LMs calling themselves. it was originally meant to be treated as a "language model" that centers on sub-calling at arbitrary levels of recursion to handle arbitrarily long tasks. you can hate the name, but don't act like "naming" formalizations of concepts isn't normal in both academia and industry. are you going to throw a fit at anthropic for calling their feature "dynamic workflows"?
> except for the subject of concern is, i kid you not, just 'treating long prompts as an environment and letting the LLM programmatically search, split up, and call itself over snippets of the prompt'
...and?
Chain-of-thought is just asking the LM to think before it acts. yet the paper was insightful in its experiments and in showing that it works
ReAct is just applying CoT and adding "act" in a for loop. same deal.
CodeAct is just making code the only tool for an LM and implementing all tools in code. again, same deal.
GRPO is just PPO but with shittier estimates.
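for anyone who wants the GRPO line spelled out: PPO typically gets its baseline from a learned value function, while GRPO just samples a group of completions per prompt and normalizes each reward against the group. a rough sketch of only that advantage step is below; `grpo_advantages` and `eps` are my own names, and using the population std is an assumption.

```python
from statistics import mean, pstdev
from typing import List


def grpo_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Group-relative advantages: (r_i - mean(group)) / (std(group) + eps).

    No critic is involved; the group mean stands in for the value baseline.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std (assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# rewards for four sampled completions of the same prompt
advantages = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

with only a handful of samples per group that baseline is a noisy estimate, which is the whole joke.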
there's a giant list of industry and academic papers that are of this form, don't act like this is a bad thing or out of place. I'm not implying that RLMs are as impactful as any of these ideas, but you can at least acknowledge prior coding agents are clearly not RLMs and cannot perform the same set of actions, but dynamic workflows can. it's a simple idea, and it seems to work in many applications
> and they are receiving the attention that comes with that declaration
LMAO bro it's you and a guy who embarrassingly said RLMs claimed to invent sub-agents. it's totally fine if you just don't like the idea and think it's useless, and I wouldn't have responded otherwise, but of course I'm going to respond if you reduce it down to "the paper is saying nothing"