/AI · 11h ago

Researcher Trains Efficient 149M Model For Legal Contract Clause Extraction

Omar Khattab · #160

Kevin Madura (@kmad)

So /goal is awesome

Over the past few weeks I used @PrimeIntellect to train a 149M late interaction model based on GTE-ModernColBERT-v1 using PyLate, focused on clause extraction from legal contracts.

On the MLEB benchmark it does well for its size: it's the best accuracy-per-parameter open model on the task, 3rd of 17 open-source models, ahead of Google's EmbeddingGemma (308M, 0.829) and the same-size legal peer Free Law ModernBERT (0.764), behind only Qwen3-Embedding-4B/8B (which are 27–53× larger).

The agents love the prime cli. I only used the UI for paying my bill.

7:54 PM · Jun 1, 2026 · 6.4K Views
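
For readers unfamiliar with the term in the post: a late interaction model (the ColBERT family) keeps one embedding per token rather than one per text, and scores a query against a passage by taking, for each query token, its best cosine match among the passage tokens and summing those maxima. The sketch below is a toy illustration of that scoring rule in plain Python. It is not PyLate's API and not the author's training code; the function names and the two-dimensional vectors are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def maxsim_score(query_tokens, doc_tokens):
    """Late interaction (MaxSim) score: for each query token embedding,
    take its highest cosine similarity against any document token
    embedding, then sum those maxima over the query tokens."""
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


# Hypothetical token embeddings (real models use far more dimensions).
query = [[1.0, 0.0], [0.0, 1.0]]
clause_a = [[1.0, 0.0], [0.0, 1.0]]  # matches both query tokens
clause_b = [[1.0, 0.0]]              # matches only the first query token

ranked = sorted(
    [("clause_a", maxsim_score(query, clause_a)),
     ("clause_b", maxsim_score(query, clause_b))],
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked)
```

Because every query token is matched independently, a passage that covers all the query terms outranks one that covers only some of them. That per-token matching is the usual motivation for applying this kind of scoring to fine-grained retrieval such as clause extraction.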
