TensorFlow CNN training images are all different sizes

Devin Haslam · Dec 21, 2017 · Viewed 8.5k times

I have created a deep convolutional neural network to classify individual pixels in an image. My training data will always be the same size (32x32x7), but my testing data can be any size.

Github Repository

Currently, my model will only work on images that are all the same size. I have used the TensorFlow MNIST tutorial extensively to help me construct my model, but that tutorial only uses 28x28 images. How would the following MNIST model be changed to accept images of any size?

 x = tf.placeholder(tf.float32, shape=[None, 784])  # flattened 28x28 input images
 y_ = tf.placeholder(tf.float32, shape=[None, 10])  # one-hot class labels
 W = tf.Variable(tf.zeros([784, 10]))               # weights
 b = tf.Variable(tf.zeros([10]))                    # biases
 x_image = tf.reshape(x, [-1, 28, 28, 1])           # reshape back to NHWC images

To make things a little more complicated, my model has transpose convolutions where the output shape needs to be specified. How would I adjust the following line of code so that the transpose convolution outputs a shape that is the same size as its input?

  DeConnv1 = tf.nn.conv3d_transpose(layer1, filter = w, output_shape = [1,32,32,7,1], strides = [1,2,2,2,1], padding = 'SAME')     

Answer

gidim · Dec 31, 2017

Unfortunately there's no way to build dynamic graphs in TensorFlow (you could try TensorFlow Fold, but that's outside the scope of the question). This leaves you with two options:

  1. Bucketing: You create multiple input tensors in a few hand-picked sizes and then at runtime you choose the right bucket (see the Seq2seq-with-bucketing example). Either way you'll probably need the second option as well; a minimal sketch of the idea follows this list.

  2. Resize the input and output images. Assuming the images all maintain the same aspect ratio, you can try resizing the image before inference. Not sure why you care about the output, since MNIST is a classification task.
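
For option 1, here's a minimal sketch of the bucketing idea, assuming a few hand-picked volume sizes; the bucket sizes, the conv3d layer and the names below are illustrative placeholders, not your actual network:

import tensorflow as tf

# Hand-picked bucket sizes (height, width, depth); adjust to your data.
BUCKET_SIZES = [(32, 32, 7), (64, 64, 7), (128, 128, 7)]

def build_model(height, width, depth):
    # One input placeholder per bucket; reuse=tf.AUTO_REUSE makes every
    # bucket share the same weights.
    x = tf.placeholder(tf.float32, shape=[None, height, width, depth, 1],
                       name='x_%dx%dx%d' % (height, width, depth))
    with tf.variable_scope('model', reuse=tf.AUTO_REUSE):
        net = tf.layers.conv3d(x, filters=8, kernel_size=3,
                               padding='same', name='conv1')
        # ... the rest of the network goes here ...
    return x, net

# Build one copy of the graph per bucket, all sharing variables.
buckets = {size: build_model(*size) for size in BUCKET_SIZES}

def pick_bucket(volume):
    # Return the smallest bucket the input volume fits into (pad up to it).
    h, w, d = volume.shape[:3]
    for size in BUCKET_SIZES:
        if h <= size[0] and w <= size[1] and d <= size[2]:
            return size
    raise ValueError('input is larger than the largest bucket')

At inference time you pad the incoming volume up to pick_bucket(volume) and feed it through the placeholder of the matching bucket.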

Either way you can use the same approach:

from PIL import Image

basewidth = 28 # MNIST image width
img = Image.open('your_input_img.jpg')
wpercent = (basewidth/float(img.size[0]))
hsize = int((float(img.size[1])*float(wpercent)))
img = img.resize((basewidth,hsize), Image.ANTIALIAS)

# Save image or feed directly to tensorflow 
img.save('feed_to_tf.jpg')
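
If you'd rather feed the image straight in without saving it, here's a sketch of converting it to a NumPy array for the MNIST-style placeholder above; `sess` and `y` (the model's output op) are assumed to already exist, and the reshape only works if the resized image really is 28x28, so crop or pad the height when the aspect ratio differs:

import numpy as np

arr = np.asarray(img.convert('L'), dtype=np.float32) / 255.0  # grayscale, scaled to [0, 1]
# prediction = sess.run(y, feed_dict={x: arr.reshape(1, 784)})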