KeyError: ''val_loss" when training model

Phuc Nguyen picture Phuc Nguyen · Jul 2, 2019 · Viewed 10.5k times · Source

I am training a model with keras and am getting an error in callback in fit_generator function. I always run to epoch 3rd and get this error

annotation_path = 'train2.txt'
    log_dir = 'logs/000/'
    classes_path = 'model_data/deplao_classes.txt'
    anchors_path = 'model_data/yolo_anchors.txt'
    class_names = get_classes(classes_path)
    num_classes = len(class_names)
    anchors = get_anchors(anchors_path)

    input_shape = (416,416) # multiple of 32, hw

    is_tiny_version = len(anchors)==6 # default setting
    if is_tiny_version:
        model = create_tiny_model(input_shape, anchors, num_classes,
            freeze_body=2, weights_path='model_data/tiny_yolo_weights.h5')
    else:
        model = create_model(input_shape, anchors, num_classes,
            freeze_body=2, weights_path='model_data/yolo_weights.h5') # make sure you know what you freeze

    logging = TensorBoard(log_dir=log_dir)
    checkpoint = ModelCheckpoint(log_dir + 'ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5',
        monitor='val_loss', save_weights_only=True, save_best_only=True, period=3)

    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1)
    early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1)


[error]
Traceback (most recent call last):
  File "train.py", line 194, in <module>
    _main()
  File "train.py", line 69, in _main
    callbacks=[logging, checkpoint])
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\engine\training_generator.py", line 251, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\callbacks.py", line 79, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\callbacks.py", line 429, in on_epoch_end
    filepath = self.filepath.format(epoch=epoch + 1, **logs)
KeyError: 'val_loss'

can anyone find out problem to help me?

Thanks in advance for your help.

Answer

Pedro Marques picture Pedro Marques · Jul 2, 2019

This callback runs at the end of iteration 3.

    checkpoint = ModelCheckpoint(log_dir + 'ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5',
        monitor='val_loss', save_weights_only=True, save_best_only=True, period=3)

The error message is claiming that there is no val_loss in the logs variable when executing:

filepath = self.filepath.format(epoch=epoch + 1, **logs)

This would happen if fit is called without validation_data.

I would start by simplifying the path name for model checkpoint. It is probably enough to include the epoch in the name.