I'm currently learning about Convolutional Neural Networks by studying examples like the MNIST examples. During the training of a neural network, I often see output like:
Epoch | Train loss | Valid loss | Train / Val
--------|--------------|--------------|---------------
50 | 0.004756 | 0.007043 | 0.675330
100 | 0.004440 | 0.005321 | 0.834432
250 | 0.003974 | 0.003928 | 1.011598
500 | 0.002574 | 0.002347 | 1.096366
1000 | 0.001861 | 0.001613 | 1.153796
1500 | 0.001558 | 0.001372 | 1.135849
2000 | 0.001409 | 0.001230 | 1.144821
2500 | 0.001295 | 0.001146 | 1.130188
3000 | 0.001195 | 0.001087 | 1.099271
Besides the epochs, can someone give me an explanation on what exactly each column represents and what the values mean? I see a lot of tutorials on basic cnn's, but I haven't run into one that explains this in detail.
It appears that a held-out set of data is being used, in addition to that used to train the network. Training loss is the error on the training set of data. Validation loss is the error after running the validation set of data through the trained network. Train/valid is the ratio between the two.
Unexpectedly, as the epochs increase both validation and training error drop. At a certain point though, while the training error continues to drop (the network learns the data better and better) the validation error begins to rise -- this is overfitting
!