1d ago

Séb Krier, Google DeepMind AGI policy lead, requests research clarifying how LLM pre-training loss predicts scaling behavior

Gavin Leech highlighted Zeyuan Allen-Zhu's research on training mechanics.

Sentiment: 100% positive, 0% negative (1 comment). Users praised related research by Allen-Zhu on how pre-training loss outperforms model size for predicting LLM scaling.