/Tech · 5h ago

AllenAI releases ModSleuth to map LLM dependencies, finding OLMo 3 relies on 89 prior models

It found Nemotron 3 relies on 273 predecessor models

141982210147.1K

#201

Original post

Ai2@allen_ai

LLMs are no longer created w/ human data alone. They rely on other models to generate & filter data, evaluate outputs, & guide dev work.

So what is a modern LLM built on? Olmo 3 → 89 model + 183 dataset dependencies; Nemotron 3 → 273 + 560

We made ModSleuth to trace this. 🧵

8:55 AM · Jun 11, 2026 · 26.7K Views

Sentiment

Some users expressed pride in researcher Sewon Min for her work developing the ModSleuth tool that maps model dependencies in modern LLMs.

Pos

100.0%

Neg

0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 8K · Bookmarks: 31 · Likes: 65 · Retweets: 12 · Replies: 1

Sewon Min@sewon__min

One day I tried tracing all of Olmo's dependencies manually. A few hours later, I realized I couldn't do it and gave up. Then @sadhikesaven and @CoderBak built ModSleuth 🔥

Turns out Olmo and Nemotron have hundreds of dependencies that are super deep, recursive, and not easily visible. I'm glad I gave up early 😅

Spoiler: I thought this would be a one-week Claude Code project. It was not.

The hard part wasn't information extraction (which Claude Code is good at). The hard part was something much trickier. Check out the paper to learn more!

(And yes, if a model release says it used Claude Code, ModSleuth will trace that too... which means the model depends on Claude Code, which has its own dependencies, and ModSleuth itself depends on Claude Code 🤯)

4h · 8K views · 65 likes · 31 bookmarks

Sewon Min@sewon__min

@sadhikesaven @CoderBak Check out 👉 Demo: http://modsleuth.cal-data-audit.org Paper: http://arxiv.org/abs/2606.12385

4h61041

Ai2@allen_ai

As LLM pipelines become more complex, we need tools like ModSleuth to identify what artifacts models are built on.

▶️ Demo: https://modsleuth.cal-data-audit.org
📄 Paper: https://arxiv.org/abs/2606.12385

6h34321

Ai2@allen_ai

ModSleuth generates a graph that surfaces what's nearly impossible to find manually, including:

📜 Hidden license inheritance
🔗 Train/eval coupling
📝 Documentation inconsistencies
🤖 Models used as judges, filters, OCR systems, & data generators

6h3481

Ai2@allen_ai

Modern LLM dependencies are scattered, recursive, & hard to see. So how do we even find them all?

ModSleuth helps by reading papers, model & dataset cards, code configs, & upstream artifacts, then reconstructing a model's “family tree.”

6h247

Ai2@allen_ai

Some dependency chains go 8 hops deep—a web of models & data that contributed to an LLM’s core.

Turns out AI supply chains may be more tangled than we thought.

6h114

Ai2@allen_ai

A model's lineage is broader than its training data, & every step can affect what – and how – the final model learns.

Without provenance, it's harder to know where model dependencies came from, whether benchmark scores are accurate, & which upstream licenses/terms may apply.

6h77

Haoxiang Sun@CoderBak

3/ One surprising lesson:

With Claude Code (which ModSleuth is built on), information extraction is no longer the main bottleneck.

The hard part is semantic and representational:
• What actually counts as a dependency?
• When do different names refer to the same artifact?
• How do you reconcile versions, model families, development stages, and repositories?

The challenge is no longer finding information—it's making sense of it.

Check out the paper to see how we tackle these problems.

5h30
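The second question in the post above (when do different names refer to the same artifact?) can be illustrated with a small sketch. This is not ModSleuth's method, which the paper describes; the `normalize` rule and the artifact names are hypothetical, chosen only to show the shape of the problem.

```python
import re
from collections import defaultdict

def normalize(name: str) -> str:
    """Hypothetical canonical key: drop any org prefix, lowercase,
    and remove every character that is not a letter or digit."""
    base = name.split("/")[-1].lower()
    return re.sub(r"[^a-z0-9]", "", base)

def group_aliases(names):
    """Group surface names that share the same canonical key."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return list(groups.values())

# Made-up names, as they might appear in a paper, a card, and a config.
mentions = ["example-org/Model-A", "model a", "Model_A", "Model-A-v2"]
aliases = group_aliases(mentions)
```

A rule this simple merges spelling variants but keeps `Model-A-v2` separate, which is exactly where the harder judgment calls about versions and model families begin.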

Haoxiang Sun@CoderBak

4/ Across 4 open-source releases, ModSleuth recovers 1,060 source-verified dependencies, with chains up to 8 hops deep. This graph also surfaces findings that are hard to find manually:
• License-relevant multi-hop paths
• Train-evaluation coupling
• Mismatches between papers, cards, and code

5h27
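To make "license-relevant multi-hop paths" concrete, here is a minimal sketch that finds one shortest path from a model to each upstream artifact carrying a flagged license. The graph, the license labels, and the function name `license_paths` are assumptions for illustration and do not reflect ModSleuth's actual data model.

```python
from collections import deque

def license_paths(graph, licenses, root, flagged):
    """For each reachable artifact whose license is in `flagged`,
    return one shortest path from `root` to that artifact."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, ()):
            if dep not in parent:        # first visit gives the shortest path
                parent[dep] = node
                queue.append(dep)

    paths = {}
    for node in parent:
        if node != root and licenses.get(node) in flagged:
            path, cur = [], node
            while cur is not None:       # walk parent links back to the root
                path.append(cur)
                cur = parent[cur]
            paths[node] = path[::-1]
    return paths

# Hypothetical toy graph: edges point from an artifact to what it depends on.
toy_graph = {"model-x": ["dataset-a"], "dataset-a": ["generator-model"]}
toy_licenses = {"dataset-a": "permissive", "generator-model": "non-commercial"}
found = license_paths(toy_graph, toy_licenses, "model-x", {"non-commercial"})
```

In this toy case the flagged license sits two hops upstream, behind a permissively licensed dataset, so it would not show up by inspecting direct dependencies alone.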

Haoxiang Sun@CoderBak

2/ Model-to-model influence is now so diverse, complex, and recursive that it far outpaces humans' ability to trace.

So we built ModSleuth: an agentic system that automatically reconstructs a model's dependency graph.

It reads papers, model cards, dataset cards, code, configs, and upstream artifacts, then pieces together a model's "family tree."

Some dependency chains go 8 hops deep.

5h20
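As a rough illustration of what "hops deep" means in such a graph, the sketch below collects every transitive dependency of a model along with its minimum hop distance. It assumes the dependency graph has already been extracted; the toy graph and the function name `transitive_dependencies` are hypothetical, and the paper may define chain depth differently.

```python
from collections import deque

def transitive_dependencies(graph, root):
    """Return {artifact: minimum hop count from root} for every
    artifact reachable from `root`, excluding `root` itself."""
    hops = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, ()):
            if dep not in hops:          # visited check also makes cycles safe
                hops[dep] = hops[node] + 1
                queue.append(dep)
    del hops[root]
    return hops

# Hypothetical toy graph with a cycle (dataset-b was built using model-x).
toy_graph = {
    "model-x": ["dataset-a", "judge-model"],
    "dataset-a": ["generator-model"],
    "generator-model": ["dataset-b"],
    "dataset-b": ["model-x"],
}
deps = transitive_dependencies(toy_graph, "model-x")
deepest = max(deps.values())
```

Breadth-first search is used here because it terminates on recursive dependencies without extra bookkeeping; it reports the shortest route to each artifact, not the longest chain.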

Sanjay Adhikesaven@sadhikesaven

One surprising lesson:

With Claude Code (which ModSleuth is built on), information extraction is no longer the main bottleneck.

The hard part is semantic and representational:
• What actually counts as a dependency?
• When do different names refer to the same artifact?
• How do you reconcile versions, model families, development stages, and repositories?

The challenge is no longer finding information—it's making sense of it.

Check out the paper to see how we tackle these problems.

5h17

Sanjay Adhikesaven@sadhikesaven

Across 4 open-source releases, ModSleuth recovers 1,060 source-verified dependencies, with chains up to 8 hops deep. This graph also surfaces findings that are hard to find manually:
• License-relevant multi-hop paths
• Train-evaluation coupling
• Mismatches between papers, cards, and code

5h15

Haoxiang Sun@CoderBak

Demo: https://modsleuth.cal-data-audit.org
Code: https://github.com/cal-data-audit/modsleuth
Paper: https://arxiv.org/abs/2606.12385

Built at UC Berkeley EECS, BAIR, and Berkeley NLP. @Berkeley_EECS @berkeley_ai @BerkeleyNLP

This project was co-led with @sadhikesaven - I couldn’t have pulled this off without his incredible partnership. Also a huge thanks to our advisor @sewon__min for her invaluable guidance and support throughout the whole process! 🙌 We are also incredibly grateful to Kyle Lo, Noah Smith, Hanna Hajishirzi, Rishi Bommasani, and the wider SM group & Ai2 members for their constructive feedback.

We’d love to hear your thoughts! Let’s build a more transparent LLM ecosystem together.

5h38

Sanjay Adhikesaven@sadhikesaven

Model-to-model influence is now so diverse, complex, and recursive that it far outpaces humans' ability to trace.

So we built ModSleuth: an agentic system that automatically reconstructs a model's dependency graph.

It reads papers, model cards, dataset cards, code, configs, and upstream artifacts, then pieces together a model's "family tree."

Some dependency chains go 8 hops deep.

5h10

Haoxiang Sun@CoderBak

Today, LLMs are no longer built from human data alone. They rely on other LLMs to generate training data, filter corpora, evaluate outputs, provide rewards, and guide development decisions. So how many models and datasets is a modern LLM built on?

• OLMo 3 → 89 model + 183 dataset dependencies
• Nemotron 3 → 273 model + 560 dataset dependencies

How did we find out? We built ModSleuth. 🧵

5h10K75

Sanjay Adhikesaven@sadhikesaven

• OLMo 3 → 89 model + 183 dataset dependencies
• Nemotron 3 → 273 model + 560 dataset dependencies

How did we find out? We built ModSleuth. 🧵

5h2.6K1617

Tiến sĩ Tình dục học@van_manh24313

@sewon__min @sadhikesaven @CoderBak Hi Sewon Min, I'm proud of you

4h34

David Albright@dalbright

Super relevant and timely in light of Anthropic's recent decisions to restrict use of their best models for LLM development. This new research from @sewon__min shows that frontier model development may be even more interconnected than we thought...

5h65354

Sanjay Adhikesaven@sadhikesaven

Demo: https://modsleuth.cal-data-audit.org
Code: https://github.com/cal-data-audit/modsleuth
Paper: https://arxiv.org/abs/2606.12385

with amazing collaborators @CoderBak @sewon__min !!

5h21