Optimization

April 7, 2024


The next step in training a model is optimization: the process of adjusting the model's trainable parameters to minimize the loss.

A common way to do this is gradient descent: at each step, compute the gradient of the loss function with respect to the parameters, then move the parameters a small step in the opposite direction of that gradient. A small sketch follows below. Learn more about gradient descent here: [[Gradient-Descent and SGD]].
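As a minimal sketch (not any particular library's implementation), here is gradient descent fitting a single weight of a made-up linear model `y = w * x` with a mean-squared-error loss. The data, learning rate, and step count are all illustrative assumptions.

```python
import numpy as np

# Hypothetical toy problem: recover the weight of y = w * x from noisy data.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + rng.normal(scale=0.1, size=100)  # true weight is 3.0

w = 0.0    # trainable parameter, arbitrary initialization
lr = 0.1   # learning rate (step size)

for step in range(50):
    y_pred = w * x
    loss = np.mean((y_pred - y) ** 2)      # mean squared error
    grad = np.mean(2 * (y_pred - y) * x)   # dLoss/dw
    w -= lr * grad                         # step opposite the gradient

print(w)  # ends up close to 3.0
```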

The model calculates the gradient of the loss function with respect to the parameters through a process called backpropagation. Learn more about backpropagation here: [[Backpropagation]].
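To make the idea concrete, here is a hand-written backward pass for a tiny one-hidden-layer network, applying the chain rule from the loss back to each weight matrix. The shapes, activation, and variable names are assumptions for illustration; real frameworks automate exactly this bookkeeping.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # batch of 4 inputs, 3 features
y = rng.normal(size=(4, 1))   # targets

W1 = rng.normal(size=(3, 5)) * 0.1
W2 = rng.normal(size=(5, 1)) * 0.1

# Forward pass, keeping the intermediates the backward pass will need.
h_pre = x @ W1                 # pre-activation
h = np.tanh(h_pre)             # hidden activation
y_pred = h @ W2
loss = np.mean((y_pred - y) ** 2)

# Backward pass: chain rule, layer by layer, from the loss to the parameters.
d_y_pred = 2 * (y_pred - y) / y.shape[0]     # dLoss/dy_pred
dW2 = h.T @ d_y_pred                         # dLoss/dW2
d_h = d_y_pred @ W2.T                        # dLoss/dh
d_h_pre = d_h * (1 - np.tanh(h_pre) ** 2)    # through the tanh nonlinearity
dW1 = x.T @ d_h_pre                          # dLoss/dW1
```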

A historic problem tied to backpropagation is the vanishing gradient problem: as gradients are propagated backward through many layers, they can shrink toward zero, so early layers barely learn. Learn more about the vanishing gradient problem here: [[The Vanishing Gradient Problem]].
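A rough illustration of one contributing factor, under the simplifying assumption that we only track the activation derivatives (ignoring the weights): the sigmoid's derivative is at most 0.25, so multiplying many of them together, as backpropagation does through a deep sigmoid stack, drives the gradient toward zero.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Push a scalar through many sigmoid "layers" and accumulate the local
# derivatives, as the backward pass would (weights ignored for simplicity).
depth = 30
z = 0.5
grad = 1.0
for _ in range(depth):
    s = sigmoid(z)
    grad *= s * (1 - s)   # sigmoid derivative, never larger than 0.25
    z = s

print(grad)  # shrinks toward zero as depth grows
```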