MIT CSAIL's Laura Ruis notes GPT-1 required only 0.96 petaflop-days of training compute on eight GPUs · Digg