Momentum Gradient Cosine Similarity Turns Negative During Constant LR Training · Digg
7h ago