How to locate multiple objects in the same image?

user3425890 · Feb 21, 2017 · Viewed 13.8k times

I am a newbie in TensorFlow.

Currently, I am testing some of the classification examples ("Convolutional Neural Network") on the TensorFlow website. They explain how to classify input images into pre-defined classes, but I can't figure out how to locate multiple objects in the same image. For example, given an input image with a cat and a dog, I want my graph to report in its output that both of them, "a cat and a dog", are in the image.

Answer

rmeertens · Jul 18, 2017

Great question. Detecting multiple objects in the same image essentially boils down to an "object detection" problem (a close relative of segmentation). Two nice and popular algorithms are YOLO (You Only Look Once) and SSD (Single Shot MultiBox Detector). I included links to them at the bottom.

I would watch a few videos on how YOLO works, and see if you grasp the idea. Then read the paper on SSD, and see if you get why this algorithm is even faster and more precise.

Both algorithms are single-pass: they look at the image only "once" and predict bounding boxes for the categories they spot. There are more precise algorithms, but they are slower: they first pick many spots they want to look at, and then run a classifier on each spot separately. The result is that they run this classifier many times per image, which is slow.
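To make the "many times per image" point concrete, here is a rough sketch in plain Python. It uses a simple sliding window as a stand-in for the "spots" the slower methods pick (real region-proposal methods choose their spots more cleverly), and the image size, window size and stride are made-up numbers for illustration only.

```python
def sliding_window_boxes(image_w, image_h, window, stride):
    """Return (x1, y1, x2, y2) for every square window that fits in the image."""
    boxes = []
    for y in range(0, image_h - window + 1, stride):
        for x in range(0, image_w - window + 1, stride):
            boxes.append((x, y, x + window, y + window))
    return boxes


# Hypothetical numbers: a 416x416 image, 104-pixel windows, 52-pixel stride.
boxes = sliding_window_boxes(416, 416, window=104, stride=52)

# A "classify every spot" approach needs one classifier run per box,
# while a single-pass detector runs the network once on the whole image.
classifier_runs_per_spot_approach = len(boxes)
classifier_runs_single_pass = 1

print(classifier_runs_per_spot_approach, "runs vs", classifier_runs_single_pass)
```

Even with one window size you already get dozens of classifier runs per image, and in practice you would want several sizes and aspect ratios on top of that.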

Since you said you are new to TensorFlow, you can try this code that other people made: https://github.com/thtrieu/darkflow . The very extensive README shows you how to get started on your own dataset.
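Once a detector gives you its boxes, getting to "a cat and a dog" is just a bit of post-processing. The sketch below assumes each detection is a dict with "label", "confidence", "topleft" and "bottomright" keys, which is how I remember the darkflow README describing its predictions; check the README for the exact format. The sample detections, the 0.5 threshold and the helper names `summarize_detections` and `describe` are made up for illustration.

```python
from collections import Counter


def summarize_detections(detections, min_confidence=0.5):
    """Count detected labels, ignoring low-confidence boxes."""
    kept = [d for d in detections if d["confidence"] >= min_confidence]
    return dict(Counter(d["label"] for d in kept))


def describe(counts):
    """Turn label counts into a short human-readable summary."""
    if not counts:
        return "no objects found"
    return " and ".join(f"{n} x {label}" for label, n in sorted(counts.items()))


# Made-up detections in the assumed format (not real model output).
sample = [
    {"label": "cat", "confidence": 0.91,
     "topleft": {"x": 30, "y": 40}, "bottomright": {"x": 200, "y": 310}},
    {"label": "dog", "confidence": 0.87,
     "topleft": {"x": 220, "y": 60}, "bottomright": {"x": 400, "y": 330}},
    {"label": "dog", "confidence": 0.12,  # weak duplicate, filtered out below
     "topleft": {"x": 225, "y": 70}, "bottomright": {"x": 390, "y": 320}},
]

counts = summarize_detections(sample)
print(describe(counts))
```

The confidence threshold matters: set it too low and you get spurious duplicates, too high and you miss objects, so it is worth tuning on your own images.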

Good luck, and let us know if you have other questions, or if these algorithms do not fit your use-case.