
Fun things we kept seeing:
right answer, wrong mechanism (great prediction, wrong causal graph) give the agent the perfect experiments and its structure recovery gets worse — discovery is in choosing them, not the data it stops experimenting way too early




