/AI5h ago

AI to Enable Formal Proofs for Compilers, Linux, and Chips by 2027

66510253.7K
Original post
Geoffrey Irving@geoffreyirving#349inAI

New paper with Gopal Sarma, Rachel Steratore, and Sunny Bhatt, and me surveying formal methods folk about importance and tractability of applications to AI safety. I'm excited this is out!

Here is a broader plea for people to be very ambitious about verifying software! 馃У

2:46 AM 路 Jun 8, 2026 路 2.5K Views
Sentiment

Users in the replies highlight the value of formal methods for building correct compilers and memory-safe systems, viewing these as major, defense-dominant bets worth pursuing for AI safety.

Pos
100.0%
Neg
0.0%
1 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Most Activity
VIEWS525REPLIES2
Geoffrey Irving@geoffreyirving

AI-assisted formal proofs (in particular in Lean) are getting very good! A worry I have is that people will insufficiently update about how powerful this stuff can be, and thus fail to tackle sufficiently big projects.

https://www.rand.org/pubs/research_reports/RRA4881-1.html

Geoffrey Irving@geoffreyirving

New paper with Gopal Sarma, Rachel Steratore, and Sunny Bhatt, and me surveying formal methods folk about importance and tractability of applications to AI safety. I'm excited this is out!

Here is a broader plea for people to be very ambitious about verifying software! 馃У

5hViews 525Likes 5Bookmarks 2
BOOKMARKS2LIKES12RETWEETS4
Geoffrey Irving@geoffreyirving

So, here are some predictions! By the end of 2027, we will have formal proofs* of all of

1. The correctness of clang and gcc 2. Lack of memory errors in Linux 3. Internally, within at least one major hardware company (Intel, Apple, or Nvidia, say), correctness of an entire chip

Geoffrey Irving@geoffreyirving

AI-assisted formal proofs (in particular in Lean) are getting very good! A worry I have is that people will insufficiently update about how powerful this stuff can be, and thus fail to tackle sufficiently big projects.

https://www.rand.org/pubs/research_reports/RRA4881-1.html

5hViews 413Likes 12Bookmarks 2
Geoffrey Irving@geoffreyirving

*Of course, these proofs will be against specs, and there is insufficient time to write these specs via human effort alone. So the specs will be partially AI generated, and thus provide only partial confidence about actual correctness of the software.

Geoffrey Irving@geoffreyirving

So, here are some predictions! By the end of 2027, we will have formal proofs* of all of

1. The correctness of clang and gcc 2. Lack of memory errors in Linux 3. Internally, within at least one major hardware company (Intel, Apple, or Nvidia, say), correctness of an entire chip

5hViews 106Likes 4Bookmarks 0
Geoffrey Irving@geoffreyirving

A further subtlety is that why I didn't say "correctness of Linux". Two more detailed claims:

1. Linux is sparsely close to a version of Linux with no memory errors 2. Linux is sparsely far from any versions with no denial of service attacks

Geoffrey Irving@geoffreyirving

*Of course, these proofs will be against specs, and there is insufficient time to write these specs via human effort alone. So the specs will be partially AI generated, and thus provide only partial confidence about actual correctness of the software.

5hViews 92Likes 3Bookmarks 0
Geoffrey Irving@geoffreyirving

That is, correctness of an OS against certain bugs is unachievable without big switch to something like seL4 or the like (designed from the ground up to be much harder). For memory errors, we can find and fix them locally during proof construction; DDOS is more global.

5hViews 52Likes 1
Geoffrey Irving@geoffreyirving

Correct clang + correct gcc + memory-safe Linux would be very big! And there are certainly many other huge bets that would be valuable to take! Formal methods is beautifully defence-dominant; let's scale it up!

5hViews 119
Geoffrey Irving@geoffreyirving

Oops, failed to specifically probabilities on those predictions: I would put >80% on each of 1, 2, 3.

5hViews 96Likes 1