I need some help with a Keras loss function. I have been implementing a custom loss function in Keras with the TensorFlow backend.
I have implemented the custom loss function in NumPy, but it would be great if it could be translated into a Keras loss function. The loss function takes a dataframe of encodings and a series of user ids. The Euclidean distance between two rows counts as positive if they share the same user_id and as negative if the user_ids differ. The function returns the sum of these signed distances over all pairs of rows, as a scalar.
def custom_loss_numpy(encodings, user_id):
    # user_id: a pandas series of users
    # encodings: a pandas dataframe of encodings
    batch_dist = 0
    for i in range(len(user_id)):
        first_row = encodings.iloc[i, :].values
        first_user = user_id[i]
        for j in range(i + 1, len(user_id)):
            second_user = user_id[j]
            second_row = encodings.iloc[j, :].values
            # compute distance: if the users are the same, the Euclidean distance is positive, otherwise negative
            if first_user == second_user:
                tmp_dist = np.linalg.norm(first_row - second_row)
            else:
                tmp_dist = -np.linalg.norm(first_row - second_row)
            batch_dist += tmp_dist
    return batch_dist
I have tried to implement this as a Keras loss function. I tried to extract NumPy arrays from the y_true and y_pred tensor objects.
def custom_loss_keras(y_true, y_pred):
    # session of my program
    sess = tf_session.TF_Session().get()
    with sess.as_default():
        array_pred = y_pred.eval()
        print(array_pred)
But I get the following error.
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'dense_1_input' with dtype float and shape [?,102]
[[Node: dense_1_input = Placeholder[dtype=DT_FLOAT, shape=[?,102], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Any kind of help would be really appreciated.
First of all, it is not possible to "extract numpy array from y_true and y_pred" in Keras loss functions. You have to operate on the tensors with Keras backend functions (or TF functions) to calculate the loss.
In other words, it would be better to think about a "vectorized" way to calculate the loss, without using if-else and loops.
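Your loss function can be computed in the following steps:
1. Generate a matrix of pairwise Euclidean distances between all pairs of vectors in encodings.
2. Generate a matrix I whose element I_ij is 1 if user_i == user_j, and -1 if user_i != user_j.
3. Multiply the two matrices element-wise and sum up the elements to get the final loss.
Here's an implementation: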
from keras import backend as K

def custom_loss_keras(user_id, encodings):
    # calculate pairwise Euclidean distance matrix
    pairwise_diff = K.expand_dims(encodings, 0) - K.expand_dims(encodings, 1)
    pairwise_squared_distance = K.sum(K.square(pairwise_diff), axis=-1)

    # add a small number before taking K.sqrt for numerical safety
    # (K.sqrt(0) sometimes becomes nan)
    pairwise_distance = K.sqrt(pairwise_squared_distance + K.epsilon())

    # this will be a pairwise matrix of True and False, with shape (batch_size, batch_size)
    pairwise_equal = K.equal(K.expand_dims(user_id, 0), K.expand_dims(user_id, 1))

    # convert True and False to 1 and -1
    pos_neg = K.cast(pairwise_equal, K.floatx()) * 2 - 1

    # divide by 2 to match the output of `custom_loss_numpy`, but it's not really necessary
    return K.sum(pairwise_distance * pos_neg, axis=-1) / 2
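As a cross-check, the same three steps can be written in plain NumPy, where the intermediate shapes are easy to inspect. This is only a sketch: vectorized_loss_numpy and loop_loss_numpy are hypothetical names introduced here, the inputs are plain arrays rather than a dataframe and a series, and no epsilon term is added, so the two functions should agree up to floating-point error.

```python
import numpy as np

def vectorized_loss_numpy(encodings, user_id):
    # hypothetical helper: NumPy mirror of the three steps
    # step 1: pairwise Euclidean distances, shape (n, n)
    diff = encodings[None, :, :] - encodings[:, None, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    # step 2: +1 where the users match, -1 otherwise
    sign = np.where(user_id[None, :] == user_id[:, None], 1.0, -1.0)
    # step 3: each pair appears twice, as (i, j) and (j, i), so halve the sum
    return np.sum(dist * sign) / 2

def loop_loss_numpy(encodings, user_id):
    # hypothetical reference: explicit loop over pairs with i < j
    total = 0.0
    n = len(user_id)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(encodings[i] - encodings[j])
            total += d if user_id[i] == user_id[j] else -d
    return total

rng = np.random.default_rng(0)
encodings = rng.random((32, 10))
user_id = rng.integers(10, size=32)
print(np.isclose(vectorized_loss_numpy(encodings, user_id),
                 loop_loss_numpy(encodings, user_id)))
```

The diagonal of the distance matrix is zero here, so summing the full matrix and halving gives the same value as summing over pairs with i < j.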
I've assumed that the user_id values are integers in the code above. The trick here is to use K.expand_dims for implementing pairwise operations. It's probably a bit difficult to understand at first glance, but it's quite useful.
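To make the expand_dims trick more concrete, here is a small NumPy illustration of the shapes involved. It uses np.expand_dims as a stand-in for K.expand_dims, on the assumption that the backend follows the same broadcasting rules; the array x is made up for the example.

```python
import numpy as np

x = np.arange(6, dtype=float).reshape(3, 2)   # 3 vectors of dimension 2

a = np.expand_dims(x, 0)   # shape (1, 3, 2)
b = np.expand_dims(x, 1)   # shape (3, 1, 2)
pairwise_diff = a - b      # broadcasts to shape (3, 3, 2)

print(a.shape, b.shape, pairwise_diff.shape)
# entry [i, j] of the result holds x[j] - x[i]
print(np.array_equal(pairwise_diff[2, 0], x[0] - x[2]))
```

Inserting a length-1 axis in two different positions lets broadcasting pair every row with every other row, which is what replaces the nested loops.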
It should give about the same loss value as custom_loss_numpy (there will be a small difference because of K.epsilon()):
encodings = np.random.rand(32, 10)
user_id = np.random.randint(10, size=32)
print(K.eval(custom_loss_keras(K.variable(user_id), K.variable(encodings))).sum())
-478.4245
print(custom_loss_numpy(pd.DataFrame(encodings), pd.Series(user_id)))
-478.42953553795815
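The gap between the two numbers is consistent with the diagonal of the distance matrix: each self-distance is exactly 0 in the NumPy version but becomes sqrt(epsilon) in the Keras version. A rough estimate follows, assuming the default K.epsilon() of 1e-7 (hard-coded below as an assumption rather than read from Keras):

```python
import math

batch_size = 32
epsilon = 1e-7  # assumed default value of K.epsilon()

# each diagonal entry contributes sqrt(epsilon) with sign +1, and the sum is halved
diagonal_offset = batch_size * math.sqrt(epsilon) / 2
print(round(diagonal_offset, 4))  # 0.0051
```

That is about 0.005, roughly the difference between the two printed values. The off-diagonal entries also shift slightly, but by a much smaller amount.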
I've made a mistake in the loss function.
When this function is used in training, Keras automatically changes y_true to be at least 2D, so the argument user_id is no longer a 1D tensor. Its shape will be (batch_size, 1).
In order to use this function, the extra axis must be removed:
def custom_loss_keras(user_id, encodings):
    pairwise_diff = K.expand_dims(encodings, 0) - K.expand_dims(encodings, 1)
    pairwise_squared_distance = K.sum(K.square(pairwise_diff), axis=-1)
    pairwise_distance = K.sqrt(pairwise_squared_distance + K.epsilon())

    user_id = K.squeeze(user_id, axis=1)  # remove the axis added by Keras
    pairwise_equal = K.equal(K.expand_dims(user_id, 0), K.expand_dims(user_id, 1))

    pos_neg = K.cast(pairwise_equal, K.floatx()) * 2 - 1
    return K.sum(pairwise_distance * pos_neg, axis=-1) / 2
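To see why the extra axis matters, the shapes can be reproduced in NumPy. This is a sketch that assumes the backend broadcasts the same way NumPy does; dist and user_id_2d are made-up stand-ins for the distance matrix and for y_true as Keras passes it.

```python
import numpy as np

n = 4
dist = np.ones((n, n))                        # stand-in for the pairwise distance matrix
user_id_2d = np.array([[0], [1], [0], [2]])   # shape (n, 1), like y_true during training

# without squeezing, the comparison keeps the trailing axis
eq_bad = np.expand_dims(user_id_2d, 0) == np.expand_dims(user_id_2d, 1)
print(eq_bad.shape)            # (4, 4, 1)
print((dist * eq_bad).shape)   # (4, 4, 4): broadcast into an unintended 3D array

# after squeezing, the mask has the intended (n, n) shape
user_id_1d = np.squeeze(user_id_2d, axis=1)
eq_good = np.expand_dims(user_id_1d, 0) == np.expand_dims(user_id_1d, 1)
print(eq_good.shape)           # (4, 4)
```

No error is raised in the unsqueezed case. The product silently broadcasts to the wrong shape, so the loss would be computed over unintended combinations.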