The term for adjusting the learning rate based on the history of gradients during training is adaptive learning rate.
This is a widely used technique in optimization algorithms, particularly for training deep neural networks. By tracking statistics of past gradients, the optimizer automatically scales the learning rate for each parameter individually, which often leads to faster convergence and less sensitivity to the initial choice of learning rate.
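As a concrete illustration, here is a minimal PyTorch sketch of how an adaptive optimizer slots into an ordinary training loop; the tiny linear model, random data, and hyperparameter values are placeholders chosen just for the example:

```python
import torch
import torch.nn as nn

# Placeholder model and data; any nn.Module and dataset work the same way.
model = nn.Linear(10, 1)
x, y = torch.randn(32, 10), torch.randn(32, 1)
loss_fn = nn.MSELoss()

# Plain SGD applies one fixed learning rate to every parameter:
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Adam instead keeps per-parameter running statistics of the gradients
# and scales each parameter's step size accordingly.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()  # per-parameter update based on gradient history
```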
Here are some popular adaptive learning rate algorithms (a minimal sketch of their update rules follows the list):
- AdaGrad: This algorithm accumulates the sum of squared gradients for each parameter and uses it to scale down the learning rate. Because the accumulator only grows, the effective learning rate shrinks monotonically, which can stall training in long runs.
- RMSprop: This algorithm replaces AdaGrad's growing sum with an exponentially decaying average of squared gradients, which keeps the effective learning rate from decaying to zero and yields more stable updates.
- Adam: This algorithm combines momentum (a decaying average of the gradients) with RMSprop-style scaling by a decaying average of squared gradients, plus bias correction for the early steps, making it one of the most widely used optimizers for deep learning.
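To make the differences concrete, here is a minimal NumPy sketch of the three update rules for a single parameter vector; the toy objective, the hyperparameter defaults, and the dictionary-based state handling are assumptions made for the example:

```python
import numpy as np

def adagrad_update(w, grad, state, lr=0.01, eps=1e-8):
    # Accumulate the sum of squared gradients; each coordinate's
    # effective step size only ever shrinks.
    state["g2"] = state.get("g2", np.zeros_like(w)) + grad**2
    return w - lr * grad / (np.sqrt(state["g2"]) + eps)

def rmsprop_update(w, grad, state, lr=0.001, rho=0.9, eps=1e-8):
    # Exponentially decaying average of squared gradients keeps the
    # effective step size from decaying to zero.
    state["g2"] = rho * state.get("g2", np.zeros_like(w)) + (1 - rho) * grad**2
    return w - lr * grad / (np.sqrt(state["g2"]) + eps)

def adam_update(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment (momentum) plus second moment (RMSprop-style scaling),
    # both bias-corrected for the early steps.
    t = state["t"] = state.get("t", 0) + 1
    state["m"] = beta1 * state.get("m", np.zeros_like(w)) + (1 - beta1) * grad
    state["v"] = beta2 * state.get("v", np.zeros_like(w)) + (1 - beta2) * grad**2
    m_hat = state["m"] / (1 - beta1**t)
    v_hat = state["v"] / (1 - beta2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy usage: minimize f(w) = ||w||^2, whose gradient is 2w.
w, state = np.ones(3), {}
for _ in range(100):
    w = adam_update(w, 2 * w, state, lr=0.1)
print(w)  # w has been driven toward the minimum at the origin
```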
Overall, adaptive learning rate techniques play a crucial role in training deep neural networks efficiently and effectively.