I am working through the TensorFlow tutorial, which uses a "weird" format to upload the data. I would like to use the NumPy or pandas format for the data, so that I can compare it with scikit-learn results.
I get the digit recognition data from Kaggle: https://www.kaggle.com/c/digit-recognizer/data.
Here the code from the TensorFlow tutorial (which works fine):
# Stuff from tensorflow tutorial
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
Here I read the data, strip out the target variables and split the data into testing and training datasets (this all works fine):
# Read dataframe from training data
csvfile='train.csv'
from pandas import DataFrame, read_csv
df = read_csv(csvfile)
# Strip off the target data and make it a separate dataframe.
Target = df.label
del df["label"]
# Split data into training and testing sets
msk = np.random.rand(len(df)) < 0.8
dfTest = df[~msk]
TargetTest = Target[~msk]
df = df[msk]
Target = Target[msk]
# One hot encode the target
OHTarget=pd.get_dummies(Target)
OHTargetTest=pd.get_dummies(TargetTest)
Now, when I try to run the training step, I get a FailedPreconditionError
:
for i in range(100):
batch = np.array(df[i*50:i*50+50].values)
batch = np.multiply(batch, 1.0 / 255.0)
Target_batch = np.array(OHTarget[i*50:i*50+50].values)
Target_batch = np.multiply(Target_batch, 1.0 / 255.0)
train_step.run(feed_dict={x: batch, y_: Target_batch})
Here's the full error:
---------------------------------------------------------------------------
FailedPreconditionError Traceback (most recent call last)
<ipython-input-82-967faab7d494> in <module>()
4 Target_batch = np.array(OHTarget[i*50:i*50+50].values)
5 Target_batch = np.multiply(Target_batch, 1.0 / 255.0)
----> 6 train_step.run(feed_dict={x: batch, y_: Target_batch})
/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.pyc in run(self, feed_dict, session)
1265 none, the default session will be used.
1266 """
-> 1267 _run_using_default_session(self, feed_dict, self.graph, session)
1268
1269
/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.pyc in _run_using_default_session(operation, feed_dict, graph, session)
2761 "the operation's graph is different from the session's "
2762 "graph.")
-> 2763 session.run(operation, feed_dict)
2764
2765
/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict)
343
344 # Run request and get response.
--> 345 results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
346
347 # User may have fetched the same tensor multiple times, but we
/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_run(self, target_list, fetch_list, feed_dict)
417 # pylint: disable=protected-access
418 raise errors._make_specific_exception(node_def, op, e.error_message,
--> 419 e.code)
420 # pylint: enable=protected-access
421 raise e_type, e_value, e_traceback
FailedPreconditionError: Attempting to use uninitialized value Variable_1
[[Node: gradients/add_grad/Shape_1 = Shape[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](Variable_1)]]
Caused by op u'gradients/add_grad/Shape_1', defined at:
File "/Users/user32/anaconda/lib/python2.7/runpy.py", line 162, in _run_module_as_main
...........
...which was originally created as op u'add', defined at:
File "/Users/user32/anaconda/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
[elided 17 identical lines from previous traceback]
File "/Users/user32/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 3066, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-45-59183d86e462>", line 1, in <module>
y = tf.nn.softmax(tf.matmul(x,W) + b)
File "/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 403, in binary_op_wrapper
return func(x, y, name=name)
File "/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 44, in add
return _op_def_lib.apply_op("Add", x=x, y=y, name=name)
File "/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
op_def=op_def)
File "/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Users/user32/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 988, in __init__
self._traceback = _extract_stack()
Any ideas as to how I can fix this?
The FailedPreconditionError
arises because the program is attempting to read a variable (named "Variable_1"
) before it has been initialized. In TensorFlow, all variables must be explicitly initialized, by running their "initializer" operations. For convenience, you can run all of the variable initializers in the current session by executing the following statement before your training loop:
tf.initialize_all_variables().run()
Note that this answer assumes that, as in the question, you are using tf.InteractiveSession
, which allows you to run operations without specifying a session. For non-interactive uses, it is more common to use tf.Session
, and initialize as follows:
init_op = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init_op)