Caffe: Reading LMDB from Python

ytrewq · Oct 14, 2015

I've extracted features using Caffe, which generates a .mdb file. Now I'm trying to read it in Python and display the values as readable numbers.

import lmdb

lmdb_env = lmdb.open('caffefeat')
lmdb_txn = lmdb_env.begin()
lmdb_cursor = lmdb_txn.cursor()

for key, value in lmdb_cursor:
    print str(value)

This prints out a very long line of unreadable, broken characters.

Then I tried printing int(value), which returns the following:

ValueError: invalid literal for int() with base 10: '\x08\x80 \x10\x01\x18\x015\x8d\x80\xad?5'

float(value) gives the following:

ValueError: could not convert string to float:? 5????5

Is this a problem with the lmdb file itself, or does it have to do with conversion of data type?

Answer

ytrewq · Oct 14, 2015

Here's the working code I figured out:

import caffe
import lmdb

lmdb_env = lmdb.open('directory_containing_mdb')
lmdb_txn = lmdb_env.begin()
lmdb_cursor = lmdb_txn.cursor()
datum = caffe.proto.caffe_pb2.Datum()

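for key, value in lmdb_cursor:
    # Each value is a serialized Datum protobuf message, not a plain number.
    datum.ParseFromString(value)
    label = datum.label  # a single integer per record, so it cannot be zipped
    data = caffe.io.datum_to_array(datum)  # array of shape (channels, height, width)
    print(key, label, data)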
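
To answer the original question: the file is most likely fine, and it's a data type issue. Each value in the LMDB is a serialized Datum protobuf message, which is why int() and float() choke on it and why datum.ParseFromString is needed. As a rough illustration, here is a Python 3 sketch that decodes the bytes from the ValueError above by hand, with no Caffe or lmdb install needed. The field numbers in FIELD_NAMES are what I remember from Caffe's caffe.proto, so treat them as an assumption, and read_varint / decode_prefix are just helper names made up for this example. It only handles the two wire types that show up in this sample.

```python
import struct

# First bytes of one LMDB value, copied from the ValueError in the question.
# Only the first float is complete; the trailing b"5" is the tag of the next one.
raw = b"\x08\x80 \x10\x01\x18\x015\x8d\x80\xad?5"

# Assumed field numbers of Caffe's Datum message (from memory of caffe.proto).
FIELD_NAMES = {1: "channels", 2: "height", 3: "width", 5: "label", 6: "float_data"}


def read_varint(buf, pos):
    """Decode a protobuf base-128 varint starting at buf[pos]."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def decode_prefix(buf):
    """Decode fields until the buffer ends or a field is cut off."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        name = FIELD_NAMES.get(number, "field_%d" % number)
        if wire_type == 0:  # varint (the int32 fields)
            value, pos = read_varint(buf, pos)
        elif wire_type == 5:  # 32-bit little-endian (one float)
            if pos + 4 > len(buf):
                break  # the sample is truncated here
            value = struct.unpack_from("<f", buf, pos)[0]
            pos += 4
        else:
            break  # other wire types are not needed for this sample
        fields.append((name, value))
    return fields


for name, value in decode_prefix(raw):
    print(name, value)
```

If the field numbers are right, the sample starts with channels, height and width, followed by the first float_data entry, so the "broken characters" are just the binary encoding of the feature vector. In real code, stick with caffe_pb2.Datum and caffe.io.datum_to_array rather than parsing by hand.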