I have downloaded the MNIST dataset from LeCun's site. I want to write Python code that extracts the gzip files and reads the dataset directly from a local directory, so that I don't have to download from or access the MNIST site anymore.
Desired process: access folder/directory --> extract gzip --> read dataset (one-hot encoding)
How can I do this? Almost all tutorials access either the LeCun or TensorFlow site to download and read the dataset. Thanks in advance!
This TensorFlow call
from tensorflow.examples.tutorials.mnist import input_data
input_data.read_data_sets('my/directory')
... won't download anything if you already have the files there.
But if for some reason you wish to unzip it yourself, here's how you do it:
from tensorflow.contrib.learn.python.learn.datasets.mnist import extract_images, extract_labels
# Each helper takes an open binary file object for the .gz file and returns a NumPy array.
with open('my/directory/train-images-idx3-ubyte.gz', 'rb') as f:
    train_images = extract_images(f)
with open('my/directory/train-labels-idx1-ubyte.gz', 'rb') as f:
    train_labels = extract_labels(f)
with open('my/directory/t10k-images-idx3-ubyte.gz', 'rb') as f:
    test_images = extract_images(f)
with open('my/directory/t10k-labels-idx1-ubyte.gz', 'rb') as f:
    test_labels = extract_labels(f)
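If you would rather not depend on those TensorFlow helper modules (as far as I know, `tensorflow.contrib` and `tensorflow.examples.tutorials` are not shipped with TensorFlow 2.x), the same "open directory, extract gzip, read dataset, one-hot encode" flow can be sketched with only `gzip`, `struct` and NumPy. This is a minimal sketch, not the library's implementation: the function names `read_idx_images`, `read_idx_labels` and `one_hot` are my own, and it assumes the files follow the standard IDX layout (big-endian header with magic number 2051 for images and 2049 for labels, followed by unsigned bytes).

```python
import gzip
import struct

import numpy as np


def read_idx_images(path):
    """Read a gzipped IDX3 image file into a uint8 array of shape (count, rows, cols)."""
    with gzip.open(path, "rb") as f:
        # Header: four big-endian unsigned 32-bit ints (magic, count, rows, cols).
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 2051:
            raise ValueError("unexpected magic number %d in %s" % (magic, path))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    if data.size != count * rows * cols:
        raise ValueError("pixel data in %s does not match its header" % path)
    return data.reshape(count, rows, cols)


def read_idx_labels(path):
    """Read a gzipped IDX1 label file into a uint8 array of shape (count,)."""
    with gzip.open(path, "rb") as f:
        # Header: two big-endian unsigned 32-bit ints (magic, count).
        magic, count = struct.unpack(">II", f.read(8))
        if magic != 2049:
            raise ValueError("unexpected magic number %d in %s" % (magic, path))
        labels = np.frombuffer(f.read(), dtype=np.uint8)
    if labels.size != count:
        raise ValueError("label data in %s does not match its header" % path)
    return labels


def one_hot(labels, num_classes=10):
    """Turn integer labels into one-hot rows by indexing an identity matrix."""
    return np.eye(num_classes, dtype=np.float32)[labels]
```

With the files from the question in place, `train_images = read_idx_images('my/directory/train-images-idx3-ubyte.gz')` and `train_labels = one_hot(read_idx_labels('my/directory/train-labels-idx1-ubyte.gz'))` should give you the images and one-hot labels; the test files work the same way. Note that the pixel values stay as raw bytes here, so scale them yourself if your model expects floats.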