TensorFlow TFRecord: Can't parse serialized example

Stewart_R · Nov 27, 2018

I am following this guide to serialize my input data into the TFRecord format, but I keep hitting this error when I try to read it back:

InvalidArgumentError: Key: my_key. Can't parse serialized Example.

I am not sure where I'm going wrong. Here is a minimal reproduction of the issue I cannot get past.

Serialise some sample data:

import tensorflow as tf

# Write 10 identical examples; each feature holds a list of three values.
with tf.python_io.TFRecordWriter('train.tfrecords') as writer:
    for idx in range(10):
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 2, 3])),
                    'test': tf.train.Feature(float_list=tf.train.FloatList(value=[0.1, 0.2, 0.3])),
                }
            )
        )
        writer.write(example.SerializeToString())
# The with block closes the writer on exit, so no explicit close() is needed.

Parsing function & deserialise:

def parse(tfrecord):
  features = {
      'label': tf.FixedLenFeature([], tf.int64, default_value=0),
      'test': tf.FixedLenFeature([], tf.float32, default_value=0.0),
  }
  return tf.parse_single_example(tfrecord, features)

dataset = tf.data.TFRecordDataset('train.tfrecords').map(parse)
getnext = dataset.make_one_shot_iterator().get_next()

When trying to run this:

with tf.Session() as sess:
  v = sess.run(getnext)
  print(v)

I trigger the above error message.

Is it possible to get past this error and deserialize my data?

Answer

Vlad-HC · Nov 27, 2018

tf.FixedLenFeature() reads fixed-size arrays of data, and the shape of that data must be declared beforehand. A shape of [] declares a scalar, so the parser expects exactly one value per feature, but each example was written with three values for both 'label' and 'test'. Updating the parse function so that the declared shape matches the written data should do the job:

def parse(tfrecord):
    return tf.parse_single_example(tfrecord, features={
        'label': tf.FixedLenFeature([3], tf.int64, default_value=[0, 0, 0]),
        'test': tf.FixedLenFeature([3], tf.float32, default_value=[0.0, 0.0, 0.0]),
    })
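To make the shape rule concrete, here is a small pure-Python sketch of the element-count check that a fixed-length spec implies. It is only an illustration and not TensorFlow's implementation: the helper names `expected_count` and `check_fixed_len` are hypothetical, and the sketch assumes the rule is simply that the number of stored values must equal the product of the declared dimensions.

```python
from functools import reduce
from operator import mul


def expected_count(shape):
    """Number of values a fixed-length spec with this shape requires.

    An empty shape ([]) denotes a scalar, so exactly one value is expected.
    """
    return reduce(mul, shape, 1)


def check_fixed_len(values, shape):
    """Return True if `values` exactly fills a tensor of the declared `shape`."""
    return len(values) == expected_count(shape)


label_values = [1, 2, 3]  # what the writer stored under 'label'

print(check_fixed_len(label_values, []))   # False: scalar spec from the question
print(check_fixed_len(label_values, [3]))  # True: spec with shape [3]
```

The same reasoning applies to 'test': three floats were written, so a spec with shape [] cannot hold them, while shape [3] can.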