TensorFlow MomentumOptimizer issue

Afshin Oroojlooy · Aug 22, 2016 · Viewed 7.7k times

When I run the following simple code, I get this error:

tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value Variable_5/Momentum

This code works with GradientDescentOptimizer, but fails with the error above as soon as I switch to MomentumOptimizer.

Here is my code:

import tensorflow as tf
import numpy as np
import scipy.io as sio
import h5py
from tensorflow.python.training import queue_runner

maxiter = 200000
display = 1
sess = tf.InteractiveSession()
decay_rate = 0.00005
starter_learning_rate = 0.000009
alpha = 0.00005
init_momentum = 0.9

nnodes1 = 350
nnodes2 = 100
batch_size = 50

train_mat = h5py.File('Basket_train_data_binary.mat')
test_mat = h5py.File('Basket_test_data_binary.mat')

train_mat = train_mat["binary_train"].value
test_mat = test_mat["binary_test"].value

Train = np.transpose(train_mat)
Test = np.transpose(test_mat)

# import the data                                                                                                                                
#from tensorflow.examples.tutorials.mnist import input_data                                                                                      
# placeholders for the training data
x = tf.placeholder(tf.float32, shape=[None,43])
y_ = tf.placeholder(tf.float32, shape=[None])

# define the variables                                                                                                                                                                                                                                                                        
W1 = tf.Variable(tf.zeros([43,nnodes1]))

b1 = tf.Variable(tf.zeros([nnodes1]))

W2 = tf.Variable(tf.zeros([nnodes1,nnodes2]))
b2 = tf.Variable(tf.zeros([nnodes2]))

W3 = tf.Variable(tf.zeros([nnodes2,1]))
b3 = tf.Variable(tf.zeros([1]))

# Passing global_step to minimize() will increment it at each step.                                                                              
global_step = tf.Variable(0, trainable=False)
momentum = tf.Variable(init_momentum, trainable=False)


# initialize the variables
sess.run(tf.initialize_all_variables())

# prediction function (two hidden sigmoid layers plus a linear output)

layer1 = tf.nn.sigmoid(tf.matmul(x,W1) + b1)
layer2 = tf.nn.sigmoid(tf.matmul(layer1,W2) + b2)
y = tf.matmul(layer2,W3) + b3

# cost function                                                                                                                                  
cost_function = tf.reduce_sum(tf.square(y_ - y))

l2regularization = tf.reduce_sum(tf.square(W1)) + tf.reduce_sum(tf.square(b1)) + tf.reduce_sum(tf.square(W2)) + tf.reduce_sum(tf.square(b2)) + tf.reduce_sum(tf.square(W3)) + tf.reduce_sum(tf.square(b3))
loss = cost_function + alpha*l2regularization

# define the learning_rate and its decay schedule
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step, 10000, decay_rate, staircase=True)
# define the training op (the optimizer that updates the variables)
#train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
train_step = tf.train.MomentumOptimizer(learning_rate, 0.9).minimize(loss, global_step=global_step)


# evaluation
# this is the summed squared error; it is 0 when y and y_ agree exactly
correct_prediction = tf.reduce_sum(tf.square(y_ - y))

# calculate the accuracy                                                                                                                         
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


# Train the model for maxiter iterations; sampling a random mini-batch each step makes this SGD
for i in range(maxiter):
  batch = np.random.randint(0,len(Train),size=batch_size)
  train_step.run(feed_dict={x:Train[batch,0:43], y_:Train[batch,43]})
  if np.mod(i,display) == 0:
    # print test loss
    print "Test", accuracy.eval(feed_dict={x: Test[:,0:43], y_: Test[:,43]})
    # print training loss
    print "Train", sess.run(cost_function, feed_dict={x: Train[:,0:43], y_: Train[:,43]})

Please guide me on how to solve this problem. Thanks in advance, Afshin

Answer

nessuno · Aug 23, 2016

At the line

# initialize the variables
sess.run(tf.initialize_all_variables())

you're initializing only the variables that were declared before that line.

The optimizer is declared after that line, and tf.train.MomentumOptimizer creates an extra "slot" variable (the momentum accumulator) for each trainable variable when minimize() is called. That is the Variable_5/Momentum named in the error: it did not exist yet when the initializer ran, so it was left uninitialized. GradientDescentOptimizer keeps no per-variable state, which is why the same code works with it.

To fix it, move the initialization after the complete graph declaration (i.e., after every variable and op, including the optimizer, has been defined).
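
Here is a minimal sketch of the correct ordering, condensed from your script with made-up shapes and a placeholder learning rate; only the position of the init call matters:

import tensorflow as tf

sess = tf.InteractiveSession()

# 1. build the complete graph first, optimizer included
x = tf.placeholder(tf.float32, shape=[None, 43])
y_ = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.zeros([43, 1]))
b = tf.Variable(tf.zeros([1]))
y = tf.matmul(x, W) + b
loss = tf.reduce_sum(tf.square(y_ - y))
# minimize() is the call that creates the */Momentum slot variables
train_step = tf.train.MomentumOptimizer(0.01, 0.9).minimize(loss)

# 2. only now run the initializer, so the slot variables are covered too
sess.run(tf.initialize_all_variables())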

TL;DR: move

# initialize the variables
sess.run(tf.initialize_all_variables())

after

# calculate the accuracy                                                                                                                         
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
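
Side note: from TensorFlow 0.12 onward, tf.initialize_all_variables() is deprecated in favor of tf.global_variables_initializer(). The placement rule is identical, just with

# equivalent initializer on newer releases
sess.run(tf.global_variables_initializer())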