CrystalReasoner Uses LLMs And RL To Generate Precise Crystal Structures

In CrystalReasoner, we ask a simple question: can we enable structure generation with LLMs without losing precision and 3D knowledge (something LLMs tend to struggle with), while still using their reasoning ability?
We found that combining thinking tokens with RL on verifiable rewards (e.g., validity) gave the best performance at generating stable, unique, and novel structures, significantly outperforming previous CrystalTextLLM baselines.
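
To make "verifiable reward" concrete, here is a minimal sketch of one plausible validity check: a structure gets reward 1.0 only if no two atoms (including periodic images) sit closer than a cutoff. The function names, the 0.5 Å cutoff, and the binary reward are all assumptions for illustration; the thread does not say how validity is scored in CrystalReasoner.

```python
import itertools
import math


def min_interatomic_distance(lattice, frac_coords):
    """Smallest atom-atom distance, including images in neighbouring cells.

    lattice: three lattice vectors as rows (e.g. in angstrom).
    frac_coords: list of fractional coordinates, one [x, y, z] per atom.
    Only the 26 adjacent cells are searched, which is enough for a
    short-distance check on cells that are not extremely skewed.
    """
    shifts = list(itertools.product((-1, 0, 1), repeat=3))
    pairs = itertools.combinations_with_replacement(range(len(frac_coords)), 2)
    best = math.inf
    for a, b in pairs:
        for shift in shifts:
            if a == b and shift == (0, 0, 0):
                continue  # an atom is not compared with itself
            frac = [frac_coords[b][i] + shift[i] - frac_coords[a][i] for i in range(3)]
            cart = [sum(frac[i] * lattice[i][k] for i in range(3)) for k in range(3)]
            best = min(best, math.sqrt(sum(c * c for c in cart)))
    return best


def validity_reward(lattice, frac_coords, min_dist=0.5):
    """Hypothetical binary reward: 1.0 if no atoms overlap, else 0.0."""
    if not frac_coords:
        return 0.0  # an empty structure is treated as invalid
    ok = min_interatomic_distance(lattice, frac_coords) >= min_dist
    return 1.0 if ok else 0.0
```

A check like this is cheap and deterministic, which is what makes it usable as an RL reward; a real pipeline would likely also verify that the CIF parses and that the composition is charge balanced.
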
Turns out constructing the right thinking traces was key. We used robocrystallographer to construct thinking traces that describe crystallographic symmetry, local coordination, functional properties, and thermodynamic status; the model writes this trace first and then generates the final CIF file.
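
As a rough illustration of "trace first, CIF second", the sketch below assembles one training target from four pre-computed text descriptions and a CIF string. Everything here is hypothetical: the function name, the section labels, and the `<think>` tags are assumptions, and the descriptions are taken as given (in the real pipeline they would come from robocrystallographer).

```python
def build_training_target(symmetry, coordination, properties, thermo, cif_text):
    """Concatenate a structured thinking trace with the final CIF.

    All arguments are plain strings. The order of the sections mirrors the
    order listed in the thread; the tag and label format is an assumption.
    """
    sections = [
        ("Symmetry", symmetry),
        ("Local coordination", coordination),
        ("Functional properties", properties),
        ("Thermodynamic status", thermo),
    ]
    trace = "\n".join(f"{title}: {text.strip()}" for title, text in sections)
    # The trace comes before the CIF so the model reasons before it commits
    # to lattice parameters and atomic positions.
    return f"<think>\n{trace}\n</think>\n{cif_text.strip()}\n"


example = build_training_target(
    symmetry="Cubic, space group Fm-3m.",
    coordination="Na is bonded to six equivalent Cl atoms.",
    properties="Wide band gap insulator.",
    thermo="On the convex hull.",
    cif_text="data_NaCl\n_cell_length_a 5.64\n",
)
```

Keeping the sections in a fixed order gives the model a consistent scaffold to fill in, and the closing tag gives a clean point at which to split the trace from the CIF when scoring outputs.
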
As one might expect, structures with more atoms elicit longer thinking traces, which suggests CrystalReasoner adapts its reasoning to the complexity of the generation task.

To make property-conditioned generation general enough to work for any property (e.g., elastic properties, thermal expansion), we can design a general reward function that assigns positive reward to structures whose properties fall in the specified range.
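
A range-based reward of this kind can be sketched in a few lines. The function name, the reward value of 1.0, and the handling of missing predictions are assumptions; the property value itself is assumed to come from some external predictor or simulation that the thread does not specify.

```python
import math


def range_reward(value, low, high, reward=1.0):
    """Positive reward if the property lies in [low, high], else 0.0.

    value: predicted property of the generated structure, or None if the
    prediction failed (for example because the structure was invalid).
    """
    if value is None or math.isnan(value):
        return 0.0
    return reward if low <= value <= high else 0.0
```

Because the function only needs a number and a target interval, the same reward works for any scalar property: for a requested bulk modulus of 100 to 150 GPa, a structure predicted at 120 GPa is rewarded and one at 80 GPa is not.
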

Great work led by @yy_wu36197 in collaboration with @FallettaStefano and Delia McGrath.

