Object detection using Keras: a simple way to use Faster R-CNN or YOLO

A. Attia · Jun 22, 2017 · Viewed 14k times

This question may already have been answered, but I couldn't find a simple answer to it. I created a convnet using Keras to classify The Simpsons characters (dataset here).
I have 20 classes and, given an image as input, I return the character name. It's pretty simple. My dataset contains pictures with the main character in the picture, and each picture only has the name of the character as a label.

Now I would like to add an object detection task, i.e., draw a bounding box around the characters in the picture and predict which character each one is. I don't want to use a sliding window because it's really slow. So I thought about using Faster R-CNN (github repo) or YOLO (github repo). Do I have to add the coordinates of the bounding box for each picture of my training set? Is there a way to do object detection (and get bounding boxes on my test set) without providing the coordinates for the training set?

In sum, I would like to create a simple object detection model, but I don't know if it's possible to create a simpler YOLO or Faster R-CNN.

Thank you very much for any help.

Answer

Andrew Tu · Aug 9, 2017

The goal of YOLO or Faster R-CNN is to predict bounding boxes. So in short, yes, you will need to label your training data with bounding boxes to train either of them.
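To make "labeling the data" concrete, here is a minimal sketch of what a bounding-box label could look like. It assumes one plain-text row per box (image path, corner coordinates in pixels, character name). The `BoxLabel` class, the `to_csv` helper, the file path, and the coordinates are all hypothetical, and the exact annotation format depends on which Faster R-CNN or YOLO implementation you end up using, so check its README before writing your files.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class BoxLabel:
    """One labeled box: where a character is and who it is (hypothetical layout)."""
    image_path: str
    x1: int  # left edge, in pixels
    y1: int  # top edge, in pixels
    x2: int  # right edge, in pixels
    y2: int  # bottom edge, in pixels
    class_name: str


def to_csv(labels):
    """Serialize labels as one 'path,x1,y1,x2,y2,class' row per box."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for b in labels:
        # Reject degenerate boxes early; they usually indicate a labeling mistake.
        if not (b.x1 < b.x2 and b.y1 < b.y2):
            raise ValueError(f"invalid box for {b.image_path}: {b}")
        writer.writerow([b.image_path, b.x1, b.y1, b.x2, b.y2, b.class_name])
    return buf.getvalue()


# Made-up example: one image containing two characters gives two rows.
labels = [
    BoxLabel("images/homer_001.jpg", 57, 72, 210, 300, "homer_simpson"),
    BoxLabel("images/homer_001.jpg", 250, 90, 330, 280, "bart_simpson"),
]
annotation_text = to_csv(labels)
print(annotation_text)
```

Note that, unlike your classification labels, an image can have several rows here, one per character visible in it.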

Take a shortcut:

  • 1) Label a handful of bounding boxes (let's say 5 per character).
  • 2) Train Faster R-CNN or YOLO on this very small dataset.
  • 3) Run your model against the full dataset.
  • 4) It will get some boxes right and get a lot of them wrong.
  • 5) Train the model again on the images that are correctly bounded; your training set should be much bigger now.
  • 6) Repeat until you have your desired result.
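The loop above can be sketched roughly as follows. This is only an outline under stated assumptions: `train_fn` and `predict_fn` are hypothetical callables that you would wrap around whichever Faster R-CNN or YOLO implementation you use, and a confidence threshold (`min_conf`) stands in for "correctly bounded". In practice a quick manual check of the accepted boxes is safer, because a confident prediction can still be wrong.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]       # (x1, y1, x2, y2) in pixels
Label = Tuple[Box, str]               # box plus character name
Prediction = Tuple[Box, str, float]   # box, character name, confidence


def bootstrap_labels(
    seed: Dict[str, List[Label]],
    unlabeled: List[str],
    train_fn: Callable[[Dict[str, List[Label]]], object],
    predict_fn: Callable[[object, str], List[Prediction]],
    min_conf: float = 0.9,
    max_rounds: int = 5,
):
    """Grow a labeled set from a few hand-labeled images (steps 1 to 6)."""
    labeled = dict(seed)                                # step 1: hand-made labels
    remaining = [p for p in unlabeled if p not in labeled]
    model = train_fn(labeled)                           # step 2: train on the small set
    for _ in range(max_rounds):
        accepted = {}
        for path in remaining:                          # step 3: run on the rest
            preds = predict_fn(model, path)
            # Step 4: keep only boxes the model is confident about
            # (assumption: confidence is a proxy for a correct box).
            keep = [(box, name) for box, name, conf in preds if conf >= min_conf]
            if keep:
                accepted[path] = keep
        if not accepted:                                # nothing new learned: stop
            break
        labeled.update(accepted)                        # step 5: bigger training set
        remaining = [p for p in remaining if p not in accepted]
        model = train_fn(labeled)                       # step 6: retrain and repeat
    return model, labeled
```

With real models, `train_fn` would run a full training job, so each round is expensive; keeping `max_rounds` small and reviewing the accepted boxes between rounds is a reasonable compromise.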