Intro to Gradient Descent

Credit: Medium

Introduction

In a neural network, there are many possibilities for what the output could be. For example, let's say we are training a neural network to guess a person's tone of language on a scale from 0 (chill) through 0.5 (decent mood) to 1.0 (angry). We have to train the network to guess this tone somehow! Since we don't know the function that accurately predicts somebody's tone, we start by assigning random values to each weight. This means the values in the final/output layer start out as little more than random guesses. That's a problem, since the network is just guessing and hasn't been able to come to a conclusion. So we need something to train the network and tell it how far off it is. That's where gradient descent comes in.

How Does it Work?

First, we compute the cost function, which is a fancy term for the sum, over all output neurons, of the squared difference between the actual output value and the expected output value. For our example, it would look like:

f(x) = (0.5 - 1.0)^2 + ...
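The cost and update described above can be sketched in a few lines of Python. This is a minimal illustration, not the post's actual implementation: the target value, learning rate, and single-output setup are assumptions chosen to match the tone example.

```python
def cost(outputs, targets):
    # Sum of squared differences between actual and expected outputs.
    return sum((o - t) ** 2 for o, t in zip(outputs, targets))

def gradient_descent_step(outputs, targets, lr=0.1):
    # Move each output against the gradient of its squared-error term:
    # d/do (o - t)^2 = 2 * (o - t), so subtract lr * 2 * (o - t).
    return [o - lr * 2 * (o - t) for o, t in zip(outputs, targets)]

outputs = [0.5]   # network's current guess for the tone (assumed)
targets = [1.0]   # expected tone: angry (assumed)

print(cost(outputs, targets))  # (0.5 - 1.0)^2 = 0.25

# Repeatedly stepping downhill shrinks the cost toward zero.
for _ in range(50):
    outputs = gradient_descent_step(outputs, targets)
print(round(outputs[0], 3))
```

In a real network the gradient is taken with respect to the weights via backpropagation, but the same idea applies: the cost measures how far off the guess is, and each step nudges the parameters in the direction that reduces it.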