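What are vanishing and exploding gradients?

Vanishing and exploding gradients are two problems that can arise when training deep neural networks with gradient descent and backpropagation. In both cases the gradients that reach the earlier layers are badly scaled, either far too small or far too large, which makes training slow or unstable.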

Vanishing Gradient

The vanishing gradient problem occurs when gradients shrink as they are propagated backward through the network. Backpropagation multiplies one derivative term per layer, so the gradients that reach the earlier layers can become extremely small. Those layers then receive only tiny weight updates and learn much more slowly than the later layers, or effectively stop learning.

The problem typically appears when the weights and the derivatives of the activation functions are small. The derivative of the sigmoid, for example, is at most 0.25, so the product of many such terms across a deep network shrinks toward zero. The result is that the early layers train very slowly or stall.
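To make the effect concrete, here is a minimal NumPy sketch, not a full training loop. It assumes a hypothetical stack of fully connected sigmoid layers with random weights, and the function name, layer count, and width are illustrative choices. It backpropagates a gradient of ones from the output and records the gradient norm at each layer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gradient_norms_per_layer(num_layers=10, width=16, seed=0):
    """Return the gradient norm at each layer's pre-activation (index 0 = first layer).

    Illustrative setup (an assumption, not from the article): square weight
    matrices drawn from N(0, 1/width) and sigmoid activations everywhere.
    """
    rng = np.random.default_rng(seed)
    weights = [
        rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, width))
        for _ in range(num_layers)
    ]

    # Forward pass: keep each layer's activation for the backward pass.
    a = rng.normal(size=width)
    activations = []
    for W in weights:
        a = sigmoid(W @ a)
        activations.append(a)

    # Backward pass: start from a gradient of ones at the output.
    grad = np.ones(width)
    norms = []
    for W, a in zip(reversed(weights), reversed(activations)):
        grad_pre = grad * a * (1.0 - a)  # sigmoid derivative, at most 0.25
        norms.append(np.linalg.norm(grad_pre))
        grad = W.T @ grad_pre            # pass the gradient to the previous layer
    return norms[::-1]

norms = gradient_norms_per_layer()
for i, n in enumerate(norms, start=1):
    print(f"layer {i:2d}: gradient norm = {n:.3e}")
```

Under these assumptions the printed norms should shrink steadily from the last layer back to the first, which is the vanishing gradient in miniature.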

Exploding Gradient

Exploding gradients occur when gradients grow very large as they are propagated backward through the network, which makes the model unstable. The weight updates become huge, the weights themselves can grow very large, and the loss or the weights may overflow and appear as “NaN”. Common remedies include gradient clipping, careful weight initialization, and a lower learning rate.
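One widely used mitigation is gradient clipping, which rescales a gradient whenever its norm exceeds a threshold. The sketch below is a hedged NumPy illustration. The linear chain of random layers, the `weight_scale` of 1.5, and the helper names are assumptions chosen only to provoke the blow-up, not a description of any particular framework's API.

```python
import numpy as np

def clip_by_norm(grad, max_norm):
    """Rescale grad so its L2 norm is at most max_norm (direction is unchanged)."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        return grad * (max_norm / norm)
    return grad

def backprop_linear_chain(num_layers=50, width=16, weight_scale=1.5, seed=0):
    """Backpropagate a gradient of ones through a chain of random linear layers.

    Illustrative assumption: weights drawn from N(0, weight_scale**2 / width),
    so each layer multiplies the gradient norm by roughly weight_scale.
    """
    rng = np.random.default_rng(seed)
    grad = np.ones(width)
    for _ in range(num_layers):
        W = rng.normal(0.0, weight_scale / np.sqrt(width), size=(width, width))
        grad = W.T @ grad
    return grad

raw = backprop_linear_chain()
clipped = clip_by_norm(raw, max_norm=5.0)
print(f"raw gradient norm:     {np.linalg.norm(raw):.3e}")
print(f"clipped gradient norm: {np.linalg.norm(clipped):.3e}")
```

Clipping does not remove the cause of the explosion. It only bounds the size of each update so that training can continue, which is why it is often combined with better initialization or a smaller learning rate.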

Author

Naveen

Naveen Pandey has more than 2 years of experience in data science and machine learning. He is an experienced Machine Learning Engineer with a strong background in data analysis, natural language processing, and machine learning. Holding a Bachelor of Science in Information Technology from Sikkim Manipal University, he excels in leveraging cutting-edge technologies such as Large Language Models (LLMs), TensorFlow, PyTorch, and Hugging Face to develop innovative solutions.

Nomidl