AllenAI releases ModSleuth to map LLM dependencies, finding OLMo 3 relies on 89 prior models

It found Nemotron 3 relies on 273 predecessor models

Ai2@allen_ai

LLMs are no longer created w/ human data alone. They rely on other models to generate & filter data, evaluate outputs, & guide dev work.

So what is a modern LLM built on? Olmo 3 → 89 model + 183 dataset dependencies; Nemotron 3 → 273 + 560

We made ModSleuth to trace this. 🧵

8:55 AM · Jun 11, 2026 · 26.7K Views
Sentiment

Some users expressed pride in researcher Sewon Min for her work developing the ModSleuth tool that maps model dependencies in modern LLMs.

Pos 100.0% · Neg 0.0% (1 comment with sentiment)
Sewon Min@sewon__min

One day I tried tracing all of Olmo's dependencies manually. A few hours later, I realized I couldn't do it and gave up. Then @sadhikesaven and @CoderBak built ModSleuth 🔥

Turns out Olmo and Nemotron have hundreds of dependencies that are super deep, recursive, and not easily visible. I'm glad I gave up early 😅

Spoiler: I thought this would be a one-week Claude Code project. It was not.

The hard part wasn't information extraction (which Claude Code is good at). The hard part was something much trickier. Check out the paper to learn more!

(And yes, if a model release says it used Claude Code, ModSleuth will trace that too... which means the model depends on Claude Code, which has its own dependencies, and ModSleuth itself depends on Claude Code 🤯)

4h · Views 8K · Likes 65 · Bookmarks 31
Sewon Min@sewon__min

@sadhikesaven @CoderBak Check out 👉 Demo: http://modsleuth.cal-data-audit.org Paper: http://arxiv.org/abs/2606.12385

4h · Views 610 · Likes 4 · Bookmarks 1
Ai2@allen_ai

As LLM pipelines become more complex, we need tools like ModSleuth to identify what artifacts models are built on.

▶️ Demo: https://modsleuth.cal-data-audit.org 📄 Paper: https://arxiv.org/abs/2606.12385

6h · Views 343 · Likes 2 · Bookmarks 1
Ai2@allen_ai

ModSleuth generates a graph that surfaces what's nearly impossible to find manually, including:

📜 Hidden license inheritance
🔗 Train/eval coupling
📝 Documentation inconsistencies
🤖 Models used as judges, filters, OCR systems, & data generators

6h · Views 348 · Likes 1
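
To make "hidden license inheritance" concrete, here is a minimal sketch of how a multi-hop path to a restrictively licensed artifact could be surfaced once a dependency graph exists. This is not ModSleuth's code: the graph, the artifact names, the license labels, and the `license_paths` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy graph: artifact -> its direct dependencies.
GRAPH = {
    "final-model": ["synthetic-data"],
    "synthetic-data": ["generator-model"],
    "generator-model": [],
}

# Hypothetical license labels, not real license names.
LICENSES = {
    "final-model": "permissive",
    "synthetic-data": "permissive",
    "generator-model": "restricted-output",
}


def license_paths(graph, licenses, root, flagged):
    """Return every dependency path from root to an artifact with a flagged license."""
    paths = []

    def walk(node, path):
        for dep in graph.get(node, []):
            if dep in path:  # already on this path: skip to avoid cycles
                continue
            new_path = path + [dep]
            if licenses.get(dep) in flagged:
                paths.append(new_path)
            walk(dep, new_path)

    walk(root, [root])
    return paths


paths = license_paths(GRAPH, LICENSES, "final-model", {"restricted-output"})
for p in paths:
    print(" -> ".join(p))
```

In this toy case the direct dependency looks permissive, and the flagged license only shows up two hops away, which is the kind of path that is easy to miss when checking by hand.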
Ai2@allen_ai

Modern LLM dependencies are scattered, recursive, & hard to see. So how do we even find them all?

ModSleuth helps by reading papers, model & dataset cards, code configs, & upstream artifacts, then reconstructing a model's “family tree.”

6h · Views 247
Ai2@allen_ai

Some dependency chains go 8 hops deep—a web of models & data that contributed to an LLM’s core.

Turns out AI supply chains may be more tangled than we thought.

6h · Views 114
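
As a rough illustration of what "hops" means in a dependency graph, the sketch below counts transitive dependencies and their distance from a released model using breadth-first search. It is an assumption-laden toy, not the authors' method: the artifact names and the `hop_depths` helper are hypothetical, and it measures the shortest hop distance, which may differ from how the paper defines chain depth.

```python
from collections import deque

# Hypothetical toy graph: artifact -> its direct dependencies.
# teacher-model and filter-model depend on each other to mimic a recursive chain.
DEPS = {
    "final-model": ["sft-data", "base-model"],
    "sft-data": ["teacher-model"],
    "base-model": ["web-corpus"],
    "teacher-model": ["filter-model"],
    "filter-model": ["teacher-model", "web-corpus"],
    "web-corpus": [],
}


def hop_depths(graph, root):
    """Map each transitive dependency of root to its minimum hop distance (BFS)."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in depths:  # first visit is the shortest route; also stops cycles
                depths[dep] = depths[node] + 1
                queue.append(dep)
    del depths[root]  # the root is not its own dependency
    return depths


depths = hop_depths(DEPS, "final-model")
print(len(depths), "dependencies, deepest at", max(depths.values()), "hops")
```

The visited check is what keeps the traversal finite when dependencies are recursive.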
Ai2@allen_ai

A model's lineage is broader than its training data, & every step can affect what – and how – the final model learns.

Without provenance, it's harder to know where model dependencies came from, whether benchmark scores are accurate, & which upstream licenses/terms may apply.

6h · Views 77
Haoxiang Sun@CoderBak

3/ One surprising lesson:

With Claude Code (which ModSleuth is built on), information extraction is no longer the main bottleneck.

The hard part is semantic and representational:
• What actually counts as a dependency?
• When do different names refer to the same artifact?
• How do you reconcile versions, model families, development stages, and repositories?

The challenge is no longer finding information—it's making sense of it.

Check out the paper to see how we tackle these problems.

5h · Views 30
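
The "different names, same artifact" question above can be illustrated with a deliberately naive sketch that groups name variants under a normalized key. Nothing here reflects how ModSleuth resolves names: `canonical_key`, `group_aliases`, and the example strings (including the `example-org/` prefix) are hypothetical, and string normalization alone cannot reconcile versions, families, or development stages.

```python
import re
from collections import defaultdict


def canonical_key(name: str) -> str:
    """Naive normalization: drop an org prefix, lowercase, collapse separators."""
    name = name.split("/")[-1].lower()
    return re.sub(r"[^a-z0-9]+", "-", name).strip("-")


def group_aliases(names):
    """Group raw names that share the same normalized key."""
    groups = defaultdict(list)
    for raw in names:
        groups[canonical_key(raw)].append(raw)
    return dict(groups)


# Illustrative spellings only; the org prefix is made up.
mentions = ["example-org/Olmo-3", "Olmo 3", "olmo_3", "Nemotron 3"]
aliases = group_aliases(mentions)
for key, variants in aliases.items():
    print(key, "<-", variants)
```

A rule this simple merges spelling variants but would also keep "Olmo 3" and a hypothetical "Olmo 3.1" apart without knowing whether they should be linked, which hints at why the authors describe this step as the hard part.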
Haoxiang Sun@CoderBak

4/ Across 4 open-source releases, ModSleuth recovers 1,060 source-verified dependencies, with chains up to 8 hops deep. This graph also surfaces findings that are hard to find manually:
• License-relevant multi-hop paths
• Train-evaluation coupling
• Mismatches between papers, cards, and code

5h · Views 27
Haoxiang Sun@CoderBak

2/ Model-to-model influence is now so diverse, complex, and recursive that it far outpaces humans' ability to trace.

So we built ModSleuth: an agentic system that automatically reconstructs a model's dependency graph.

It reads papers, model cards, dataset cards, code, configs, and upstream artifacts, then pieces together a model's "family tree."

Some dependency chains go 8 hops deep.

5h · Views 20
Sanjay Adhikesaven@sadhikesaven

One surprising lesson:

With Claude Code (which ModSleuth is built on), information extraction is no longer the main bottleneck.

The hard part is semantic and representational:
• What actually counts as a dependency?
• When do different names refer to the same artifact?
• How do you reconcile versions, model families, development stages, and repositories?

The challenge is no longer finding information—it's making sense of it.

Check out the paper to see how we tackle these problems.

5h · Views 17
Sanjay Adhikesaven@sadhikesaven

Across 4 open-source releases, ModSleuth recovers 1,060 source-verified dependencies, with chains up to 8 hops deep. This graph also surfaces findings that are hard to find manually:
• License-relevant multi-hop paths
• Train-evaluation coupling
• Mismatches between papers, cards, and code

5h · Views 15
Haoxiang Sun@CoderBak

Demo: https://modsleuth.cal-data-audit.org Code: https://github.com/cal-data-audit/modsleuth Paper: https://arxiv.org/abs/2606.12385

Built at UC Berkeley EECS, BAIR, and Berkeley NLP. @Berkeley_EECS @berkeley_ai @BerkeleyNLP

This project was co-led with @sadhikesaven - I couldn’t have pulled this off without his incredible partnership. Also a huge thanks to our advisor @sewon__min for her invaluable guidance and support throughout the whole process! 🙌 We are also incredibly grateful to Kyle Lo, Noah Smith, Hanna Hajishirzi, Rishi Bommasani, and the wider SM group & Ai2 members for their constructive feedback.

We’d love to hear your thoughts! Let’s build a more transparent LLM ecosystem together.

5h · Views 38
Sanjay Adhikesaven@sadhikesaven

Model-to-model influence is now so diverse, complex, and recursive that it far outpaces humans' ability to trace.

So we built ModSleuth: an agentic system that automatically reconstructs a model's dependency graph.

It reads papers, model cards, dataset cards, code, configs, and upstream artifacts, then pieces together a model's "family tree."

Some dependency chains go 8 hops deep.

5h · Views 10
Haoxiang Sun@CoderBak

Today, LLMs are no longer built from human data alone. They rely on other LLMs to generate training data, filter corpora, evaluate outputs, provide rewards, and guide development decisions. So how many models and datasets is a modern LLM built on?

• OLMo 3 → 89 model + 183 dataset dependencies
• Nemotron 3 → 273 model + 560 dataset dependencies

How did we find out? We built ModSleuth. 🧵

5h · Views 10K · Likes 7 · Bookmarks 5
Sanjay Adhikesaven@sadhikesaven

Today, LLMs are no longer built from human data alone. They rely on other LLMs to generate training data, filter corpora, evaluate outputs, provide rewards, and guide development decisions. So how many models and datasets is a modern LLM built on?

• OLMo 3 → 89 model + 183 dataset dependencies
• Nemotron 3 → 273 model + 560 dataset dependencies

How did we find out? We built ModSleuth. 🧵

5h · Views 2.6K · Likes 16 · Bookmarks 17
David Albright@dalbright

Super relevant and timely in light of Anthropic's recent decisions to restrict use of their best models for LLM development. This new research from @sewon__min shows that frontier model development may be even more interconnected than we thought...

5h · Views 653 · Likes 5 · Bookmarks 4
Sanjay Adhikesaven@sadhikesaven

Demo: https://modsleuth.cal-data-audit.org Code: https://github.com/cal-data-audit/modsleuth Paper: https://arxiv.org/abs/2606.12385

with amazing collaborators @CoderBak @sewon__min !!

5h · Views 21