A guide to convert_imageset.cpp

pwhc · Jul 15, 2015 · Viewed 26.3k times

I am relatively new to machine learning, Python, and Ubuntu.

I have a set of images in .jpg format, where half contain a feature I want Caffe to learn and half don't. I'm having trouble finding a way to convert them to the required lmdb format.

I have the necessary text input files.

Can anyone provide a step-by-step guide on how to use convert_imageset.cpp in the Ubuntu terminal?

Thanks

Answer

Shai · Jul 15, 2015

A quick guide to Caffe's convert_imageset

Build

The first thing you must do is build Caffe and its tools (convert_imageset is one of them).
After installing Caffe and running make, make sure you run make tools as well.
Verify that a binary file convert_imageset was created in $CAFFE_ROOT/build/tools.
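
The verification step can be sketched as a small shell helper. The function name check_caffe_tool is hypothetical (it is not part of Caffe); it only tests whether an executable exists under build/tools of the Caffe folder you pass in.

```shell
# Hypothetical helper (not part of Caffe): check that a tool binary was built.
# Usage: check_caffe_tool /path/to/caffe convert_imageset
check_caffe_tool() (
    caffe_root=$1
    tool=$2
    tool_path="$caffe_root/build/tools/$tool"
    if [ -x "$tool_path" ]; then
        echo "found: $tool_path"
    else
        # Most likely cause: the tools target was never built.
        echo "missing: $tool_path (run 'make tools' in $caffe_root)" >&2
        exit 1
    fi
)
```

After building, calling check_caffe_tool "$CAFFE_ROOT" convert_imageset should print the path of the binary; a "missing" message suggests make tools still needs to be run.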

Prepare your data

Images: put all images in a folder (I'll call it /path/to/jpegs/ here).
Labels: create a text file (e.g., /path/to/labels/train.txt) with one line per input image, giving the file name followed by its integer label. For example:

img_0000.jpeg 1
img_0001.jpeg 0
img_0002.jpeg 0

In this example the first image is labeled 1 while the other two are labeled 0.
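
If the images are already sorted into folders, a label file in this format can be generated rather than typed by hand. The sketch below assumes a hypothetical layout in which images containing the feature sit in a pos/ subfolder and the rest in a neg/ subfolder of the images folder; the function name make_label_file and both folder names are illustrative only.

```shell
# Assumed (hypothetical) layout:
#   <images_dir>/pos/*.jpg  contain the feature  -> label 1
#   <images_dir>/neg/*.jpg  do not               -> label 0
# Paths are written relative to <images_dir>.
make_label_file() (
    images_dir=$1
    out_file=$2
    : > "$out_file"                      # start with an empty label file
    for sub_label in "pos 1" "neg 0"; do
        sub=${sub_label% *}              # folder name, e.g. "pos"
        label=${sub_label#* }            # label, e.g. "1"
        for f in "$images_dir/$sub"/*.jpg; do
            [ -e "$f" ] || continue      # skip the unexpanded glob if the folder is empty
            printf '%s/%s %s\n' "$sub" "$(basename "$f")" "$label" >> "$out_file"
        done
    done
)
```

For example, make_label_file /path/to/jpegs /path/to/labels/train.txt would write lines such as "pos/img_0000.jpg 1", one per image found.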

Convert the dataset

Run the binary in a shell:

~$ GLOG_logtostderr=1 $CAFFE_ROOT/build/tools/convert_imageset \
    --resize_height=200 --resize_width=200 --shuffle  \
    /path/to/jpegs/ \
    /path/to/labels/train.txt \
    /path/to/lmdb/train_lmdb

Command line explained:

  • Setting the GLOG_logtostderr environment variable to 1 before calling convert_imageset tells the logging mechanism to write log messages to stderr.
  • --resize_height and --resize_width resize all input images to the same size, 200x200.
  • --shuffle randomly changes the order of the images, so the order in the /path/to/labels/train.txt file is not preserved.
  • The remaining arguments are the path to the images folder, the labels text file, and the output name. Keep the trailing slash on the images folder, because the file names from the labels file are appended to it directly. Note that the output must not exist before calling convert_imageset; otherwise you'll get a scary error message.
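
Because an existing output makes the tool fail, it can help to check for it before starting a long conversion. The wrapper below is a minimal sketch of that idea using the same flags as the command above; the function name convert_train_set and the CONVERT_BIN override are assumptions for illustration and are not defined by Caffe.

```shell
# Hypothetical wrapper around the command above.
# CONVERT_BIN is an assumed override (not a Caffe variable) for the binary path.
convert_train_set() (
    images_dir=$1
    labels=$2
    out_db=$3
    bin=${CONVERT_BIN:-$CAFFE_ROOT/build/tools/convert_imageset}

    # The output must not exist yet.
    if [ -e "$out_db" ]; then
        echo "refusing to run: $out_db already exists" >&2
        exit 1
    fi
    if [ ! -f "$labels" ]; then
        echo "labels file not found: $labels" >&2
        exit 1
    fi

    # File names from the labels file are appended to the images folder,
    # so make sure it ends with exactly one "/".
    GLOG_logtostderr=1 "$bin" \
        --resize_height=200 --resize_width=200 --shuffle \
        "${images_dir%/}/" "$labels" "$out_db"
)
```

A call would look like convert_train_set /path/to/jpegs/ /path/to/labels/train.txt /path/to/lmdb/train_lmdb. Removing a stale output by hand is left to the user on purpose, so nothing is deleted silently.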

Other flags that might be useful:

  • --backend - lets you choose between an lmdb and a leveldb dataset.
  • --gray - converts all images to grayscale.
  • --encoded and --encode_type - keep image data in encoded (jpg/png) compressed form in the database.
  • --help - shows the help text; the relevant flags are listed under "Flags from tools/convert_imageset.cpp".
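
As an illustration of combining these optional flags, the sketch below builds a grayscale LevelDB dataset instead of the default LMDB one. The function name convert_gray_leveldb and the CONVERT_BIN override are hypothetical and only serve the example.

```shell
# Hypothetical variant of the conversion command using two optional flags.
# CONVERT_BIN is an assumed override (not a Caffe variable) for the binary path.
convert_gray_leveldb() (
    images_dir=$1
    labels=$2
    out_db=$3
    bin=${CONVERT_BIN:-$CAFFE_ROOT/build/tools/convert_imageset}

    # --backend=leveldb selects LevelDB as the output format;
    # --gray converts every image to grayscale.
    GLOG_logtostderr=1 "$bin" \
        --backend=leveldb --gray \
        "${images_dir%/}/" "$labels" "$out_db"
)
```

Whichever backend is chosen here has to match the backend set in the data layer of the network definition that later reads the dataset.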

You can check out $CAFFE_ROOT/examples/imagenet/create_imagenet.sh for an example of how to use convert_imageset.