How do I get TensorFlow example queues into proper batches for training?
I've got some images and labels:
IMG_6642.JPG 1
IMG_6643.JPG 2
(feel free to suggest another label format; I think I may need another dense to sparse step...)
I've read through quite a few tutorials but don't quite have it all together yet. Here's what I have, with comments indicating the steps required from TensorFlow's Reading Data page.
And after the example queue I need to get this queue into batches for training; that's where I'm stuck...
1. List of filenames
files = tf.train.match_filenames_once('*.JPG')
4. Filename queue
filename_queue = tf.train.string_input_producer(files, num_epochs=None, shuffle=True, seed=None, shared_name=None, name=None)
5. A reader
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
6. A decoder
record_defaults = [[""], [1]]
col1, col2 = tf.decode_csv(value, record_defaults=record_defaults)
(I don't think I need this step below because I already have my label in a tensor, but I include it anyway.)
features = tf.pack([col2])
The documentation page has an example that retrieves a single example at a time, rather than getting the images and labels into batches:
for i in range(1200):
    # Retrieve a single instance:
    example, label = sess.run([features, col5])
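From the same Reading Data guide it looks like nothing comes out of the queue until the queue runners are started, so here is the boilerplate I've pieced together around my own col1/col2 (untested; I'm not sure I have the initialization right):

init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    # Depending on the TensorFlow version, match_filenames_once may also
    # need tf.initialize_local_variables() to be run here.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        for i in range(1200):
            # Retrieve a single filename/label pair:
            filename, label = sess.run([col1, col2])
    finally:
        coord.request_stop()
        coord.join(threads)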
And then below it has a batching section:
def read_my_file_format(filename_queue):
    reader = tf.SomeReader()
    key, record_string = reader.read(filename_queue)
    example, label = tf.some_decoder(record_string)
    processed_example = some_processing(example)
    return processed_example, label

def input_pipeline(filenames, batch_size, num_epochs=None):
    filename_queue = tf.train.string_input_producer(
        filenames, num_epochs=num_epochs, shuffle=True)
    example, label = read_my_file_format(filename_queue)
    # min_after_dequeue defines how big a buffer we will randomly sample
    # from -- bigger means better shuffling but slower start up and more
    # memory used.
    # capacity must be larger than min_after_dequeue and the amount larger
    # determines the maximum we will prefetch. Recommendation:
    # min_after_dequeue + (num_threads + a small safety margin) * batch_size
    min_after_dequeue = 10000
    capacity = min_after_dequeue + 3 * batch_size
    example_batch, label_batch = tf.train.shuffle_batch(
        [example, label], batch_size=batch_size, capacity=capacity,
        min_after_dequeue=min_after_dequeue)
    return example_batch, label_batch
My question is: how do I combine that batching code with the code I have above? I need batches to work with, and most of the tutorials already come with MNIST batches:
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        total_batch = int(mnist.train.num_examples / batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
To make this input pipeline work, you will need to add an asynchronous queueing mechanism that generates batches of examples. This is done by creating a tf.RandomShuffleQueue or a tf.FIFOQueue and inserting JPEG images that have been read, decoded and preprocessed.
You can use handy constructs that generate the queues and the corresponding threads for running them via tf.train.shuffle_batch_join or tf.train.batch_join. Here is a simplified example of what this would look like. Note that this code is untested:
# Let's assume there is a Queue that maintains a list of all filenames
# called 'filename_queue'
reader = tf.WholeFileReader()
_, file_buffer = reader.read(filename_queue)

# Decode the JPEG image
image = decode_jpeg(file_buffer)

# Generate batches of images of this size.
batch_size = 32

# Depends on the number of files and the training speed.
min_queue_examples = batch_size * 100

# shuffle_batch_join takes a list of tensor lists, one per enqueuing thread.
images_batch = tf.train.shuffle_batch_join(
    [[image]],
    batch_size=batch_size,
    capacity=min_queue_examples + 3 * batch_size,
    min_after_dequeue=min_queue_examples)

# Run your network on this batch of images.
predictions = my_inference(images_batch)
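decode_jpeg above is a placeholder for whatever decoding and preprocessing you need. A minimal sketch (untested, with an arbitrary target size; in older TensorFlow versions resize_images takes height and width as separate arguments) could be:

def decode_jpeg(file_buffer, height=224, width=224):
    # Decode the raw JPEG bytes, convert to float in [0, 1) and resize to a
    # fixed size so that the batching op sees tensors of a known shape.
    image = tf.image.decode_jpeg(file_buffer, channels=3)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    image = tf.image.resize_images(image, [height, width])
    return image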
Depending on how you need to scale up your job, you might need to run multiple independent threads that read, decode and preprocess images and feed them into your example queue. A complete example of such a pipeline is provided in the Inception/ImageNet model; take a look at batch_inputs:
https://github.com/tensorflow/models/blob/master/inception/inception/image_processing.py#L407
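For example, a rough sketch of the multi-reader variant (num_readers is arbitrary, and decode_jpeg is the same helper as above):

num_readers = 4
images_list = []
for _ in range(num_readers):
    reader = tf.WholeFileReader()
    _, file_buffer = reader.read(filename_queue)
    images_list.append([decode_jpeg(file_buffer)])

# One inner list per reader; shuffle_batch_join starts one enqueuing thread
# for each of them and merges their output into a single shuffled batch.
images_batch = tf.train.shuffle_batch_join(
    images_list,
    batch_size=batch_size,
    capacity=min_queue_examples + 3 * batch_size,
    min_after_dequeue=min_queue_examples)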
Finally, if you are working with more than O(1000) JPEG images, keep in mind that it is extremely inefficient to individually read thousands of small files. This will slow down your training quite a bit.
A more robust and faster solution is to convert the dataset of images to a sharded TFRecord of Example protos. Here is a fully worked script for converting the ImageNet data set to such a format, and here is a set of instructions for running a generic version of this preprocessing script on an arbitrary directory containing JPEG images.
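If you want to see what that boils down to, here is a bare-bones, unsharded sketch of writing and reading such a file (the feature names loosely follow the ones used by the Inception scripts, and the filename/label pairs are just your two example images):

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

# Write one serialized Example proto per image into a TFRecord file.
writer = tf.python_io.TFRecordWriter('train-00000-of-00001.tfrecord')
for filename, label in [('IMG_6642.JPG', 1), ('IMG_6643.JPG', 2)]:
    with open(filename, 'rb') as f:
        image_buffer = f.read()
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(image_buffer),
        'image/class/label': _int64_feature(label)}))
    writer.write(example.SerializeToString())
writer.close()

# Read it back inside the graph with TFRecordReader + parse_single_example.
record_queue = tf.train.string_input_producer(['train-00000-of-00001.tfrecord'])
reader = tf.TFRecordReader()
_, serialized_example = reader.read(record_queue)
parsed = tf.parse_single_example(serialized_example, features={
    'image/encoded': tf.FixedLenFeature([], tf.string),
    'image/class/label': tf.FixedLenFeature([], tf.int64)})
image = tf.image.decode_jpeg(parsed['image/encoded'], channels=3)
label = parsed['image/class/label']

The resulting image and label tensors can then be fed through the same shuffle_batch / shuffle_batch_join machinery shown above.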