I'm trying to implement a Siamese neural network in TensorFlow, but I cannot find any working example on the Internet (see Yann LeCun's paper).
The architecture I'm trying to build would consist of two LSTMs sharing weights and only connected at the end of the network.
My question is: how do I build two neural networks that share their weights (tied weights) in TensorFlow, and how do I connect them at the end?
Thanks :)
Edit: I implemented a simple and working example of a siamese network here on MNIST.
tf.layers
If you use the tf.layers module to build your network, you can simply pass the argument reuse=True for the second part of the Siamese network:
import tensorflow as tf  # TensorFlow 1.x API (tf.layers, tf.Session)
x = tf.ones((1, 3))
y1 = tf.layers.dense(x, 4, name='h1')
y2 = tf.layers.dense(x, 4, name='h1', reuse=True)  # reuses the variables of layer 'h1'
# y1 and y2 will evaluate to the same values
sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run(y1))
print(sess.run(y2))  # both prints will return the same values
tf.get_variable
You can try using the function tf.get_variable(). (See the tutorial)
Implement the first network using a variable scope with reuse=False:
with tf.variable_scope('Inference', reuse=False):
    weights_1 = tf.get_variable('weights', shape=[1, 1],
                                initializer=...)
    output_1 = weights_1 * input_1
Then implement the second network with the same code, except using reuse=True:
with tf.variable_scope('Inference', reuse=True):
    weights_2 = tf.get_variable('weights')
    output_2 = weights_2 * input_2
The first implementation will create and initialize every variable of the LSTM, whereas the second implementation will use tf.get_variable() to get the same variables used in the first network. That way, variables will be shared.
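To make the sharing mechanism concrete, here is a toy model in plain Python of what looking a variable up by scope and name amounts to. It is only a sketch and not TensorFlow's implementation: the ToyVariableStore class and its get_variable method are hypothetical names invented for this illustration, and a variable is modeled as a one-element list.

```python
# Toy model of variable sharing by name; NOT TensorFlow's implementation.
# ToyVariableStore and its get_variable method are hypothetical names.
class ToyVariableStore:
    """Maps 'scope/name' keys to one-element lists that act as variables."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, reuse, init=0.0):
        key = scope + "/" + name
        if reuse:
            # reuse=True: only look up; fail if nothing was created before.
            if key not in self._vars:
                raise ValueError("Variable %s does not exist" % key)
            return self._vars[key]
        # reuse=False: only create; fail if the name is already taken.
        if key in self._vars:
            raise ValueError("Variable %s already exists" % key)
        self._vars[key] = [init]
        return self._vars[key]


store = ToyVariableStore()
weights_1 = store.get_variable("Inference", "weights", reuse=False, init=0.5)
weights_2 = store.get_variable("Inference", "weights", reuse=True)

weights_1[0] = 2.0               # an update made through the first branch...
shared = weights_2 is weights_1  # ...is seen by the second: same object
```

Both names end up pointing at one underlying object, which is the sense in which the two halves of the Siamese network have tied weights.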
Then you just have to use whatever loss you want (e.g. you can use the L2 distance between the two siamese networks), and the gradients will backpropagate through both networks, updating the shared variables with the sum of the gradients.
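The claim that the shared variables receive the sum of the gradients can be checked by hand on the scalar case used above (output = weights * input). The following is a minimal sketch in plain Python rather than TensorFlow; it assumes a squared L2 distance as the loss, and the input values are made up for illustration.

```python
# Sketch: gradient of a Siamese loss with respect to one shared scalar weight.
# Assumes loss = (w*x1 - w*x2)**2 (squared L2 distance); values are illustrative.

def branch(w, x):
    """One branch of the Siamese network: output = weight * input."""
    return w * x

def loss(w, x1, x2):
    """Squared distance between the two branch outputs."""
    return (branch(w, x1) - branch(w, x2)) ** 2

w, x1, x2 = 0.5, 3.0, 1.0
diff = branch(w, x1) - branch(w, x2)

# Chain rule applied to each branch separately.
grad_branch_1 = 2.0 * diff * x1    # contribution through the first branch
grad_branch_2 = -2.0 * diff * x2   # contribution through the second branch

# The shared weight is updated with the sum of both contributions.
grad_shared = grad_branch_1 + grad_branch_2

# Numerical check with a central finite difference.
eps = 1e-6
grad_numeric = (loss(w + eps, x1, x2) - loss(w - eps, x1, x2)) / (2 * eps)

print(grad_shared, grad_numeric)
```

The analytic sum and the finite-difference estimate agree, which is what an automatic differentiation framework computes when the same variable appears in both branches.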