
OpenAI Vanilla Prompt Solves Research Math 10-40x Cheaper Than Custom Academic Prompts

Original post
Sanjeev Arora (@prfsanjeevarora)

Sobering takeaway from 1stproof (round 2) https://1stproof.org/. OpenAI's vanilla prompt to 5.5pro https://tinyurl.com/yc8ymuna solves research math 10-40x cheaper than custom prompts from academic teams. We used Gemini pro. Switching to 5.5pro improves results a lot but costs rise to the level of other academic pipelines :(

7:21 AM · Jun 11, 2026 · 5.9K Views

Sanjeev Arora (@prfsanjeevarora)

During the official evaluation, our pipeline also seems to have hit a timeout error on several questions (the output was a default "Disclaimer" line with a brief report from the orchestrator). This was unfortunate, especially since it happened on several of the easier problems.


@prfsanjeevarora Hard to figure out what to do about the bitter lesson. I think it is going to be hard for researchers to succeed long term with any task that can be framed as a competition, because it will be so natural for the labs to train on it themselves. We need to find a complement to LLMs.
