👏👏 @ido_pesok
Introducing Devin Security Swarm
A more cost effective and accurate way to find security vulnerabilities in complex codebases, based on a new architecture: Agentic MapReduce.
The system runs specialized agents in sandboxes to verify findings.
Positive users praise the Agentic MapReduce framing in Devin Security Swarm for enabling orchestrated agent swarms with verification, while negative users question its sustainability as AI merely fixing vulnerabilities created by other AI.
If you’ve ever wondered why we will need 100X more AI inference in the future, and what it’s going to be driven by, this is another good example.
Devin pushes forward an idea of agentic mapreduce, which means we’ll now have swarms of agents that are processing large amounts of data (code) to handle tasks that humans never could have done before.
“Devin maps relevant signals across the repo, fans out focused agents over bounded shards, reduces their findings into one report, then verifies serious vulnerabilities in isolated sandboxes before marking them confirmed.”
In this case it’s code security, but there are tons of other use-cases in code and knowledge work. We see this at Box with customers that want to process and understand millions of documents for risk, insights, relationships, and more. This will play out in pharma, banking, and many other industries across all forms of unstructured data.
As an aside, these types of capabilities are generally only possible when you can deploy a variety of models (both the frontier and lower cost) because of the sheer amount of tokens that go into these use-cases. This is going to be a major value proposition for the applied AI layer.
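The quoted flow (map signals, fan out over bounded shards, reduce into one report, verify serious findings) can be sketched roughly as below. This is a minimal illustration, not Devin's implementation: `make_shards`, `scan_shard`, `reduce_findings`, `verify`, and `run_swarm` are hypothetical names, the "agent" is a plain string check standing in for a model call, and the "sandbox" is a stub.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    issue: str
    severity: str  # "low" or "high"
    confirmed: bool = False


def make_shards(files, shard_size):
    """Map step: split the repo's file list into bounded shards."""
    paths = sorted(files)
    return [paths[i:i + shard_size] for i in range(0, len(paths), shard_size)]


def scan_shard(files, shard):
    """Hypothetical stand-in for one focused agent; a real one would call a model."""
    findings = []
    for path in shard:
        if "eval(" in files[path]:
            findings.append(Finding(path, "eval on input", "high"))
        if "TODO" in files[path]:
            findings.append(Finding(path, "unfinished code", "low"))
    return findings


def reduce_findings(per_shard):
    """Reduce step: flatten shard results into one de-duplicated report."""
    seen, report = set(), []
    for findings in per_shard:
        for f in findings:
            if (f.path, f.issue) not in seen:
                seen.add((f.path, f.issue))
                report.append(f)
    return report


def verify(files, finding):
    """Hypothetical stand-in for reproducing a finding in an isolated sandbox."""
    return "eval(" in files[finding.path]


def run_swarm(files, shard_size=2, workers=4):
    shards = make_shards(files, shard_size)
    # Fan out: one agent per shard, run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_shard = list(pool.map(lambda s: scan_shard(files, s), shards))
    report = reduce_findings(per_shard)
    # Only serious findings go through verification before being marked confirmed.
    return [
        Finding(f.path, f.issue, f.severity, verify(files, f))
        if f.severity == "high" else f
        for f in report
    ]
```

Calling `run_swarm({"a.py": "x = eval(user_input)", "b.py": "# TODO"})` returns one report in which only the high-severity finding carries a verification result. The shard size is what bounds the context each agent sees.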
i wonder if the LM had a mechanism to launch agentic mapreduce and maybe even just general patterns

@levie Great points. As agentic mapreduce becomes more common in other domains (already seeing it internally for model research), it will put more cost pressure on the industry, which in my opinion is a good thing and will lead to great innovations like Devin Fusion

@levie tokens go brrr i guess

This is the part Jensen's five layer cake leaves out. He models land, chips, infrastructure, and models, but swarms of agents burning tokens on mapreduce-style workflows live entirely in the application layer he says NVIDIA won't build. That layer is where the actual demand for the 100X comes from.

@levie Interesting

@levie the map half is easy. it's the reduce I haven't seen anyone do well yet, merging hundreds of partial agent answers without the errors compounding

@levie 100x more inference to process the documents explaining why you need 100x more inference

@levie @dabit3 Can't handle a long-distance relationship

@levie i can already feel my ass getting clenched by the ethereal gust from all those servers

@levie compute is the new oil

@levie So the thing that keeps the cap ex flow going is AI finding and perhaps partially fixing dangerous junk some other AI made. At some point people will no longer pay for that. We’re not totally dumb.

@levie Agentic mapreduce is the right framing — we’re moving from ‘one agent, one task’ to orchestrated swarms with verification built in. The sandbox confirmation step is the key detail most people will skip over.

@levie Same pattern we've hit building our own agent fleet: the hard part was never getting one agent to work well. It's decomposing a fuzzy problem into bounded shards an agent can verify on its own — then trusting the reduce step enough to act on it.

@levie The fan-out that 100X's your inference does the same to your ungoverned action count. One task becomes hundreds of shard reads, edits, and test runs, none of them gated. Only one of those two numbers shows up on the invoice.

@levie agentic mapreduce is such a clean name, and somehow terrifyingly obvious in hindsight. the token bill is about to develop a personality

@levie Been running the manual version at ProposalPilot scale for a year: bounded sessions, task-scoped doc index of 5-10 files, proof notes so the next agent reads outcomes instead of raw history. Devin's benchmark is the first time the pattern gets measured in the open.

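@levie Devin's agentic MapReduce really does pinpoint the core of the surge in AI inference demand: moving from single-threaded programming to parallel agent clusters will multiply inference costs.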

@a1zhang The general pattern question is the right one. Map works fine. The reduce kills you. Synthesizing parallel agent findings needs more than concatenation.
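One way to read "more than concatenation" is a reduce step that groups equivalent findings and only accepts those that several independent agents agree on, so a single agent's mistake does not flow straight into the final report. The sketch below is an assumption about how that could look, not a description of any shipped system. `merge_findings` and `min_votes` are hypothetical names, and a real reducer would need fuzzier matching than lowercasing.

```python
from collections import defaultdict


def merge_findings(agent_reports, min_votes=2):
    """Group duplicate findings and keep those reported by enough agents.

    agent_reports maps an agent id to a list of (location, issue) pairs.
    """
    votes = defaultdict(set)
    for agent_id, findings in agent_reports.items():
        for location, issue in findings:
            # Normalize so trivially different phrasings collapse together.
            key = (location.strip(), issue.strip().lower())
            votes[key].add(agent_id)

    accepted, needs_review = [], []
    for key, agents in sorted(votes.items()):
        # Findings with too few independent votes are held back, not dropped.
        target = accepted if len(agents) >= min_votes else needs_review
        target.append(
            {"location": key[0], "issue": key[1], "agents": sorted(agents)}
        )
    return accepted, needs_review
```

Findings seen by only one agent land in `needs_review` rather than being discarded, which leaves room for a later verification pass to settle them.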