
Harvey post-trains NVIDIA's Nemotron 3 Ultra in 24 hours to match Sonnet 4.6 on legal agent benchmark

The post-trained model's all-pass rate rose to 5.8%.

Harvey @harvey

We partnered with @trajectorylabs to post-train NVIDIA Nemotron 3 Ultra for legal. Here’s what we found:

1) Open-weight models can reach frontier legal performance.

On our Legal Agent Benchmark (LAB), Nemotron 3 Ultra started at a 0% all-pass rate. After post-training, it reached 5.8%, placing it between Sonnet 4.6 at 4.2% and Opus 4.6 at 6.6%.

2) Post-training dramatically improves reliability.

Before training, many held-out tasks missed enough rubric dimensions to land at roughly 70% pass rates. After training, those tasks shifted toward roughly 95%.

3) Open-weight performance comes at much lower cost.

Post-trained Nemotron 3 Ultra reached a similar quality band to leading closed models while running at roughly 1/8th to 1/50th the per-token price of Sonnet 4.6 and Opus 4.6.

Most importantly: we post-trained this model on the @trajectorylabs platform less than 24 hours after Nemotron 3 Ultra launched, using the same harness, data, and recipe we used for Nemotron 3 Super.

More to come as we continue to experiment with open-weight legal agents.

Read more on post-training with Trajectory below:

Trajectory @trajectorylabs

1/ We post-trained @nvidia Nemotron 3 Ultra on @harvey Legal Agent Bench in under 24 hours.

The result: an open model reaching the same band as leading closed models on legal work, at a fraction of the cost.

The related story: when a new open model ships, Trajectory can turn it into a specialized agent almost immediately.

10:11 AM · Jun 10, 2026 · 33.2K Views
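The posts quote two different numbers: an all-pass rate (5.8%) and per-task pass rates (roughly 70% rising to roughly 95%). The sketch below shows one way the two could relate. It assumes that a task's pass rate is the fraction of rubric dimensions it satisfies and that the all-pass rate is the fraction of tasks satisfying every dimension. Those definitions, the function names, and the sample data are illustrative assumptions, not Harvey's published LAB scoring code.

```python
from typing import Dict, List

# Hypothetical rubric results: task name -> one boolean per rubric dimension.
# These values are invented for illustration; they are not LAB data.
RubricResults = Dict[str, List[bool]]


def task_pass_rate(dimensions: List[bool]) -> float:
    """Fraction of rubric dimensions that a single task passed."""
    return sum(dimensions) / len(dimensions)


def all_pass_rate(results: RubricResults) -> float:
    """Fraction of tasks that passed every rubric dimension (assumed definition)."""
    fully_passed = sum(1 for dims in results.values() if all(dims))
    return fully_passed / len(results)


example: RubricResults = {
    "task_a": [True] * 20,                # passes all 20 dimensions
    "task_b": [True] * 19 + [False],      # misses one dimension
    "task_c": [True] * 14 + [False] * 6,  # misses six dimensions
    "task_d": [True] * 19 + [False],      # misses one dimension
}

for name, dims in example.items():
    print(f"{name}: {task_pass_rate(dims):.0%} of rubric dimensions passed")
print(f"all-pass rate: {all_pass_rate(example):.0%}")
```

Under these assumptions, a task that misses a single rubric dimension still counts as a failure for the all-pass rate, which would explain how per-task pass rates near 95% can coexist with a single-digit all-pass rate.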
Ben (no treats) @andersonbcdefg

RELEASE THE WEIGHTS
