A model's capacity is essentially its ability to capture patterns in the data. Choosing the right capacity helps keep the model from either overfitting or underfitting.
A capacity that is too high can cause overfitting. When there is too little data relative to the model's capacity, the model starts to learn characteristics specific to the training set rather than the underlying pattern, and it generalizes poorly to unseen data.
A capacity that is too low can cause underfitting. The model is not flexible enough to fit the data properly, so the loss stays high even on the training data.
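
To make the two failure modes concrete, here is a minimal sketch that uses polynomial degree as a stand-in for capacity. Everything in it is an assumption chosen for illustration rather than something taken from the text above: the sine-shaped target, the noise level, the 15-point training set, and the degrees 1, 3, and 12.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Noisy samples of sin(pi * x) on [-1, 1] (a made-up target for illustration)."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.1, n)
    return x, y

x_train, y_train = make_data(15)   # deliberately small training set
x_test, y_test = make_data(200)    # held-out data to measure generalization

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

results = {}
for degree in (1, 3, 12):          # low, moderate, and high capacity
    model = Polynomial.fit(x_train, y_train, degree)  # least-squares fit
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    train_err, test_err = results[degree]
    print(f"degree {degree:2d}: train MSE = {train_err:.4f}, test MSE = {test_err:.4f}")
```

In a setup like this, the degree-1 fit should keep a high error on both sets (underfitting), while the degree-12 fit should reach the lowest training error but show a clear gap between training and test error (overfitting). The exact numbers depend on the seed and the noise level.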