
Andrew B. Hall, a Stanford GSB professor, ran a classroom experiment in which students with no coding experience built custom AI evaluations using Claude Code.

The experiment demonstrates an alternative to university AI bans: having students build their own personal evaluations of AI.

Original post

My new piece: instead of banning AI in teaching, we need to create an army of citizens who've learned how to build their own personal evals measuring whether AI fits their values. A new kind of distributed system that holds AI accountable to each of us.

This is what I've experimented with in the classroom @StanfordGSB this quarter. My students have gone from no experience with code to building their own evals for AI using Claude Code.

Every student designed their evals, got results, and created a leaderboard in a single three-hour class session with no structure or help. These ranged from studying how AI handles Brazilian elections or Burmese translation to how it solves logic puzzles and the extent to which it sticks to consequentialist philosophical values. It is mindboggling what it's possible to do in the classroom with AI now.

My argument: every new technology raises concerns about how to update the way we teach and learn. Old wisdom from Aristotle to Bacon to Tocqueville to Dewey argues that the best way to learn is by *doing*. AI gives us new ways to learn by doing, and we need to embrace these as part of our toolkit.

By building evals, students don't just gain experience managing coding agents, which will be essential to their post-college lives.

--They turn AI into an object of study rather than a tool that passively guides them.
--They get to engage their curiosity and their personal interests.
--They experience what it will be like to be a member of a new kind of democratic society in which helping to hold AI systems accountable will be key
--They have fun!

There's a lot of pessimism about AI and the teaching experience right now, but this experiment has given me some reasons for optimism.

Check out all the projects the students came up with, and more about the experiment and my argument, in the post here: https://freesystems.substack.com/p/an-army-of-citizens-building-evals

8:45 AM · May 21, 2026