XGBoost creator Tianqi Chen releases PithTrain, a compact MoE training framework designed to fit inside AI agent context windows

The 11K-line Python codebase lets agents autonomously customize code.

Ruihang Lai@ruihanglai

Two moments every ML researcher knows. You get onto a new cluster, and week one goes to fitting the framework to your setup, not training. A new architecture lands, and trying it means hacking through a gigantic codebase to stay compatible with the pipeline. What you want to change is small. The code you wade through to change it isn't.

This experience is likely not unique: many researchers we’ve talked to run into similar issues. A year of this on CMU's FLAME cluster left us with one question: what if a framework were built for an agent to adapt and evolve, not just for humans to maintain?

So we introduce PithTrain: a compact, agent-native MoE training system, now ~11K lines of Python, on four principles:

- Compact: fits in one context window
- Python-native: readable tracebacks, no compiled-extension rebuilds
- No implicit indirection: direct calls, each model in its own file
- Agent skills: in-repo playbooks for recurring tasks

Then we measured the thing nobody measures. Same agent, same tasks, only the framework underneath changes: on PithTrain it finishes with up to 62% fewer turns and 64% less GPU time than production frameworks, while training just as fast.

We call this second axis agent-task efficiency, and we believe it deserves to sit alongside training throughput as a metric worth optimizing. Excited to see what people build with it.

Built with amazing collaborators @haok1402, Haozhan Tang, Akaash Parthasarathy, @Zichun_Yu.

Blog: https://blog.mlc.ai/2026/06/01/pithtrain-compact-agent-native-moe-training-system
Code: https://github.com/mlc-ai/pith-train
Paper: https://arxiv.org/abs/2605.31463

11:01 AM · Jun 1, 2026 · 2.3K Views

Posts from X

Tianqi Chen@tqchenml

When we work with our colleagues in various places, one critical need is to optimize and customize the MoE training framework for our cluster environment. Agents can help, but existing large codebases easily outgrow an agent's context window. What if we rebuild something from the ground up that is compact and easy for agents to operate on? PithTrain is the result of that exercise. It runs scalably and efficiently for modern MoE training, allowing agents to build out new features with fewer turns, less cluster-access time, and fewer tokens. We believe that agent-native machine learning systems will favor agent-task efficiency; this is one of the first steps in that direction.
