The first step in training a model is choosing an appropriate loss function.
For regression tasks, the Mean Squared Error (MSE) is often used. The MSE computes the loss by averaging the squared differences between the predicted values and the ground truth. Learn more about the MSE here: [[Mean Squared Error]].
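A minimal sketch of the MSE in plain Python (the function name `mse` and the sample values are illustrative, not from any particular library):

```python
def mse(y_pred, y_true):
    # Average of the squared differences between predictions and targets.
    assert len(y_pred) == len(y_true)
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_pred)

# Two predictions off by 0.5 and one exact hit:
# (0.25 + 0.25 + 0.0) / 3 ≈ 0.1667
print(mse([2.5, 0.0, 2.0], [3.0, -0.5, 2.0]))
```

Because the errors are squared, large deviations dominate the loss, which is why MSE is sensitive to outliers.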
For density modeling, the loss function is the negative sum of the log-probabilities, i.e. the negative log-likelihood. For example, a set of high log-probabilities corresponds to the model considering the data very likely, so it yields a very small loss.
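A small sketch of the negative log-likelihood, assuming we already have per-sample probabilities (the function name `nll` and the example probabilities are made up for illustration):

```python
import math

def nll(probs):
    # Negative sum of log-probabilities: the higher the probability the
    # model assigns to the data, the smaller the loss.
    return -sum(math.log(p) for p in probs)

likely   = [0.9, 0.8]  # model finds the data very likely
unlikely = [0.1, 0.2]  # model finds the data unlikely
print(nll(likely), nll(unlikely))
```

The likely batch produces a much smaller loss than the unlikely one, matching the intuition above.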
For classification tasks, the Cross Entropy Loss is usually used. Learn more about the Cross Entropy Loss here: [[Cross Entropy Loss]].
The Contrastive Loss is often used in setups where the goal is to measure the similarities and differences between data samples. Learn more about the Contrastive Loss here: [[Contrastive Loss]].
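One common pairwise form of the contrastive loss can be sketched as follows (the function name, the `margin` default, and the distance values are illustrative assumptions; other variants such as the triplet or InfoNCE losses exist):

```python
def contrastive_loss(distance, is_similar, margin=1.0):
    # distance: distance between two embedded samples.
    # is_similar = 1: pull similar pairs together (penalize any distance).
    # is_similar = 0: push dissimilar pairs apart until they exceed the margin.
    if is_similar:
        return distance ** 2
    return max(margin - distance, 0.0) ** 2

print(contrastive_loss(0.0, 1))  # identical similar pair: no loss
print(contrastive_loss(2.0, 0))  # dissimilar pair beyond margin: no loss
print(contrastive_loss(0.5, 0))  # dissimilar pair too close: penalized
```

The margin prevents the model from trivially pushing all dissimilar pairs to infinite distance: once a pair is far enough apart, it stops contributing to the loss.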
It is important to note that the training loss is a simplified proxy for the model's true performance, used primarily because it makes optimization tractable. The true performance of a model should be measured with separate evaluation metrics.