Open-source developer xlr8harder extends Talkie-1930-13B to 32k context, arguing RULER benchmarks penalize its pre-1931 training data

The context extension was trained on pre-1931 books with YaRN over two afternoons on a single A100 node.

Original post
xlr8harder (@xlr8harder) · #1674 in AI

I ran a quick 32k context extension on Talkie 1930 using pre-1931 books and YaRN. It's probably possible to squeeze some more long context performance from Talkie, but for two afternoons on a single A100 node I think this is reasonable, given where the model started.

10:51 AM · Jun 1, 2026 · 2.9K Views
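
For readers unfamiliar with YaRN, the idea named in the post can be sketched roughly as follows: rotary position frequencies that complete many rotations within the original context are left alone, frequencies that complete less than one rotation are interpolated by the context scale factor, and those in between are blended, with a small attention scaling term on top. This is a minimal sketch of the general technique, not xlr8harder's actual code. The function name `yarn_inv_freqs`, the head dimension of 128, the RoPE base of 10000, and the original context length of 2048 are all illustrative assumptions; the post does not state Talkie's configuration.

```python
import math

def yarn_inv_freqs(head_dim, base=10000.0, orig_ctx=2048, target_ctx=32768,
                   alpha=1.0, beta=32.0):
    """Return (scaled RoPE inverse frequencies, attention scale), YaRN-style.

    All defaults are hypothetical placeholders, not Talkie's real settings.
    """
    s = target_ctx / orig_ctx  # context scale factor (16 with these defaults)
    inv_freqs = []
    for i in range(0, head_dim, 2):
        theta = base ** (-i / head_dim)       # standard RoPE frequency
        wavelength = 2 * math.pi / theta
        r = orig_ctx / wavelength             # rotations within the original context
        # Ramp: 0 means fully interpolate (divide by s), 1 means keep unchanged.
        gamma = min(1.0, max(0.0, (r - alpha) / (beta - alpha)))
        inv_freqs.append((1 - gamma) * theta / s + gamma * theta)
    # Scaling term from the YaRN formulation: 0.1 * ln(s) + 1.
    attn_scale = 0.1 * math.log(s) + 1.0
    return inv_freqs, attn_scale

inv_freqs, attn_scale = yarn_inv_freqs(head_dim=128)
```

With these assumed defaults, the highest-frequency dimension is unchanged and the lowest-frequency dimension is divided by 16. The model still needs fine-tuning on long documents afterwards, which is the step the post describes doing with pre-1931 books.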
Posts from X
Views 1.3K · Likes 8 · Replies 2
Cody Blakeney (@code_star)

Pretty impressive it was able to answer questions about @paulg's sandwiches with only pre-1931 knowledge.

5h · Views 1.3K · Likes 8 · Bookmarks 0