What is the difference between Dataset.from_tensors and Dataset.from_tensor_slices?

Llewlyn · Mar 30, 2018 · Viewed 18.3k times

I have a dataset represented as a NumPy matrix of shape (num_features, num_examples) and I wish to convert it to a TensorFlow tf.data.Dataset.

I am struggling to understand the difference between these two methods: Dataset.from_tensors and Dataset.from_tensor_slices. Which is the right one, and why?

The TensorFlow documentation says that both methods accept a nested structure of tensors, although when using from_tensor_slices the tensors should all have the same size in the 0-th dimension.

Answer

MatthewScarpino · Mar 30, 2018

from_tensors treats the input as a whole and returns a dataset with a single element:

import tensorflow as tf
t = tf.constant([[1, 2], [3, 4]])
ds = tf.data.Dataset.from_tensors(t)   # one element: [[1, 2], [3, 4]]

from_tensor_slices slices the input along its first dimension and creates a dataset with a separate element for each row of the input tensor:

t = tf.constant([[1, 2], [3, 4]])
ds = tf.data.Dataset.from_tensor_slices(t)   # two elements: [1, 2] and [3, 4]
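
For the matrix in the question, the practical consequence is that from_tensor_slices slices along axis 0, so a (num_features, num_examples) array would yield one element per feature unless it is transposed first. Here is a minimal sketch of that, assuming TensorFlow 2.x with eager execution. The array contents, the sizes 3 and 5, the labels array, and the variable names are all made up for illustration:

```python
import numpy as np
import tensorflow as tf

# Hypothetical data laid out as in the question: (num_features, num_examples).
num_features, num_examples = 3, 5
data = np.arange(num_features * num_examples, dtype=np.float32).reshape(
    num_features, num_examples
)

# from_tensors: the whole matrix becomes a single dataset element.
ds_whole = tf.data.Dataset.from_tensors(data)

# from_tensor_slices slices along axis 0, so without a transpose
# each element is one feature row, not one example.
ds_by_feature = tf.data.Dataset.from_tensor_slices(data)

# Transpose to (num_examples, num_features) to get one element per example.
ds_by_example = tf.data.Dataset.from_tensor_slices(data.T)

# Nested structure: every component must have the same size along axis 0.
labels = np.arange(num_examples, dtype=np.int64)  # hypothetical labels
ds_pairs = tf.data.Dataset.from_tensor_slices((data.T, labels))

print(len(list(ds_whole)))       # 1
print(len(list(ds_by_feature)))  # 3
print(len(list(ds_by_example)))  # 5
print(ds_pairs.element_spec)     # per-element shapes and dtypes
```

If the goal is to iterate or batch over examples, the transposed from_tensor_slices version is probably the one you want. from_tensors fits better when the entire array should be handed over as one item.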