I started doing language modeling research in Jan 2016, using the LSTM from the "RNN Regularization" model by @woj_zaremba @OriolVinyalsML and @ilyasut, trying to improve its perplexity on Penn Treebank (1M toks). This led to the weight tying method, sometimes still used today.
It's been ten years since @OfirPress first wrote about how to get started in deep learning -- and most of what he wrote is still relevant today!
Congratulations Ofir!👏👏👏


