Is there a way to stack two tensorflow datasets?

Kent930 · Feb 13, 2018 · Viewed 12.9k times

I want to stack two dataset objects in TensorFlow (like the rbind function in R). I have created one dataset, A, from TFRecord files and another, B, from NumPy arrays. Both have the same variables. Is there a way to stack these two datasets to create a bigger one, or to create an iterator that randomly reads data from the two sources?

Thanks

Answer

mrry · Feb 13, 2018

The tf.data.Dataset.concatenate() method is the closest analog of tf.stack() when working with datasets. If you have two datasets with the same structure (i.e. same types for each component, but possibly different shapes):

import tensorflow as tf

dataset_1 = tf.data.Dataset.range(10, 20)  # elements 10, 11, ..., 19
dataset_2 = tf.data.Dataset.range(60, 70)  # elements 60, 61, ..., 69

then you can concatenate them as follows:

combined_dataset = dataset_1.concatenate(dataset_2)
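
To check what the combined dataset contains, it can be iterated directly. The following is a minimal sketch, assuming TensorFlow 2.x with eager execution (which postdates this answer). The shuffle step is not part of the answer above; it is one possible way to approach the "read randomly from both sources" part of the question, and the buffer size of 20 is an assumption that only fits this toy example.

```python
import tensorflow as tf

dataset_1 = tf.data.Dataset.range(10, 20)
dataset_2 = tf.data.Dataset.range(60, 70)

# concatenate() yields every element of dataset_1, then every element of dataset_2.
combined_dataset = dataset_1.concatenate(dataset_2)

# as_numpy_iterator() needs eager execution (the default in TensorFlow 2.x).
values = [int(v) for v in combined_dataset.as_numpy_iterator()]
print(values)

# Assumption: a shuffle buffer at least as large as the combined dataset
# (20 elements here) mixes elements from both sources uniformly.
shuffled_dataset = combined_dataset.shuffle(buffer_size=20, seed=0)
shuffled = [int(v) for v in shuffled_dataset.as_numpy_iterator()]
print(shuffled)
```

For real data such as TFRecord files, the total number of elements is often unknown or too large to buffer. A shuffle buffer smaller than the dataset only mixes elements that are close to each other in the concatenated order, so the two sources would stay largely separated.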