Keras custom loss as a function of multiple outputs

defqoon · Aug 4, 2018 · Viewed 9.2k times

I built a custom architecture (a convnet) with Keras. The network has 4 heads, each outputting a tensor of a different size. I am trying to write a custom loss function as a function of these 4 outputs. I have implemented custom losses before, but it was always either a different loss for each head or the same loss for each head. In this case, I need to combine the 4 outputs to calculate the loss.

I am used to the following:

def custom_loss(y_true, y_pred):
    return something  # placeholder: any tensor computed from y_true and y_pred

model.compile(optimizer, loss=custom_loss)

In my case, however, I would need y_pred to be a list of the 4 outputs. I could pad the outputs with zeros and add a Concatenate layer to my model, but I was wondering whether there is an easier way.
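
For reference, the padding workaround mentioned above can be sketched in plain NumPy. This is only an illustration under assumptions: each head is taken to be already flattened to shape (batch, n_i), and the helper names `pack_heads` and `unpack_heads` as well as the head sizes are hypothetical. In a real model the slicing would presumably happen on y_pred inside custom_loss.

```python
import numpy as np

def pack_heads(heads, width):
    # Zero-pad each (batch, n_i) array on the right up to `width` columns,
    # then concatenate along the feature axis: (batch, len(heads) * width)
    padded = [
        np.pad(h, ((0, 0), (0, width - h.shape[1])), mode="constant")
        for h in heads
    ]
    return np.concatenate(padded, axis=-1)

def unpack_heads(packed, sizes, width):
    # Slice each head back out of its block, dropping the zero padding
    return [packed[:, i * width : i * width + n] for i, n in enumerate(sizes)]

# Hypothetical sizes for the 4 heads, with a batch of 5 samples
sizes = [4, 1, 3, 2]
heads = [np.random.rand(5, n) for n in sizes]
width = max(sizes)

packed = pack_heads(heads, width)
restored = unpack_heads(packed, sizes, width)
print(packed.shape)
```

The loss then has to know `sizes` and `width` in order to undo the packing, which is the bookkeeping an easier approach would avoid.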

Edit

My loss function is rather complex. Can I write something like:

model.add_loss(custom_loss(input1, input2, output1, output2))

where custom_loss is defined as:

def custom_loss(input1, input2, output1, output2):
    return loss

Answer

sdcbr · Aug 4, 2018

You could try the model.add_loss() method. The idea is to construct your custom loss as a tensor rather than as a function, add it to the model, and then compile the model without specifying a loss. See also this implementation of a variational autoencoder, where a similar idea is used.

Example:

import keras.backend as K
from keras.layers import Input, Dense
from keras.models import Model
from keras.losses import mse
import numpy as np

# Some random training data
features = np.random.rand(100,20)
labels_1 = np.random.rand(100,4)
labels_2 = np.random.rand(100,1)

# Input layer, one hidden layer
input_layer = Input((20,))
dense_1 = Dense(128)(input_layer)

# Two outputs
output_1 = Dense(4)(dense_1)
output_2 = Dense(1)(dense_1)

# Two additional 'inputs' for the labels
label_layer_1 = Input((4,))
label_layer_2 = Input((1,))

# Instantiate model, pass label layers as inputs
model = Model(inputs=[input_layer, label_layer_1, label_layer_2], outputs=[output_1, output_2])

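# Construct your custom loss as a tensor: mse() gives one value per sample,
# so the product couples the two heads before averaging over the batch
loss = K.mean(mse(label_layer_1, output_1) * mse(label_layer_2, output_2))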

# Add loss to model
model.add_loss(loss)

# Compile without specifying a loss
model.compile(optimizer='sgd')

# The labels are fed in as inputs, so the targets passed to fit() are only placeholders
dummy = np.zeros((100,))
model.fit([features, labels_1, labels_2], dummy, epochs=2)
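
To make the loss tensor above concrete, here is a minimal NumPy sketch of the same arithmetic. It assumes that Keras's mse averages the squared errors over the last axis and therefore returns one value per sample. The helper names `per_sample_mse` and `combined_loss` and the toy arrays are illustrative and not part of the original example.

```python
import numpy as np

def per_sample_mse(y_true, y_pred):
    # Mean squared error over the last axis: one value per sample
    return np.mean((y_true - y_pred) ** 2, axis=-1)

def combined_loss(label_1, output_1, label_2, output_2):
    # Multiply the two per-sample errors, then average over the batch
    return np.mean(
        per_sample_mse(label_1, output_1) * per_sample_mse(label_2, output_2)
    )

# Toy batch of 2 samples: head 1 has 4 values per sample, head 2 has 1
label_1 = np.zeros((2, 4))
output_1 = np.array([[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]])
label_2 = np.zeros((2, 1))
output_2 = np.array([[3.0], [1.0]])

loss_value = combined_loss(label_1, output_1, label_2, output_2)
print(loss_value)  # per-sample errors [1, 4] and [9, 1] -> mean([9, 4]) = 6.5
```

Because the two errors are multiplied per sample before averaging, this loss cannot be written as a sum of independent per-head losses, which is why it is attached with add_loss() rather than passed to compile().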