How to write hdf5 files without overwriting?

ilciavo · Aug 9, 2015

Sorry if this is a very basic question on h5py.

I was reading the documentation, but I didn't find a similar example.

I'm trying to create multiple HDF5 datasets with Python, but it turns out that after I close the file, the data gets overwritten.

Let's say I do the following:

import numpy as np
import h5py
f = h5py.File('test.hdf5', 'w')
f.create_dataset('data1', data = np.ones(10))
f.close()
f = h5py.File('test.hdf5', 'w')
f.create_dataset('data0', data = np.zeros(10))
f.close()
f = h5py.File('test.hdf5', 'r')
f["data1"][()]
f.close()

I get

KeyError: "Unable to open object (Object 'data1' doesn't exist)"

Appending works if I first create the file in 'w' mode and then reopen it in 'a' mode, but that requires two different statements:

import numpy as np
import h5py
f = h5py.File('test.hdf5', 'w')
f.create_dataset('data1', data = np.ones(10))
f.close()
f = h5py.File('test.hdf5', 'a')
f.create_dataset('data0', data = np.zeros(10))
f.close()
f = h5py.File('test.hdf5', 'r')
f["data1"][()]
f.close()

If I open the file in 'a' mode in both cases:

import numpy as np
import h5py
f = h5py.File('test.hdf5', 'a')
f.create_dataset('data1', data = np.ones(10))
f.close()
f = h5py.File('test.hdf5', 'a')
f.create_dataset('data0', data = np.zeros(10))
f.close()
f = h5py.File('test.hdf5', 'r')
print(f['data1'][()])
f.close()

I get

RuntimeError: Unable to create link (Name already exists)

According to the documentation, data should be stored contiguously, but I didn't find how to avoid overwriting data.

How can I store data in a previously closed HDF5 file using a single statement?

Answer

Anand S Kumar · Aug 9, 2015

If you want to create a unique file on each run, consider building that into the file name, for example by adding a timestamp to it. A very simple way is to use the datetime module's now() and strftime() methods to create the file name. Example -

import datetime
filename = "test_{}.hdf5".format(datetime.datetime.now().strftime("%Y_%m_%d_%H_%M_%S"))

Then you can use that filename to open the file.
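As a minimal sketch of that step, assuming h5py and numpy are installed: the helpers timestamped_name and write_run below are hypothetical names made up for illustration, not part of h5py. The file is opened in 'w-' mode, which creates it and fails if it already exists, so two runs within the same second raise an error instead of truncating each other.

```python
import datetime
import os

import h5py
import numpy as np


def timestamped_name(prefix="test", when=None):
    """Build a file name such as test_2015_08_09_13_33_43.hdf5 (hypothetical helper)."""
    when = when or datetime.datetime.now()
    return "{}_{}.hdf5".format(prefix, when.strftime("%Y_%m_%d_%H_%M_%S"))


def write_run(directory, data, when=None):
    """Write data to a new timestamped file and return its path (hypothetical helper)."""
    path = os.path.join(directory, timestamped_name(when=when))
    # 'w-' creates the file and raises if it already exists,
    # so a file from an earlier run is never truncated.
    with h5py.File(path, "w-") as f:
        f.create_dataset("data1", data=data)
    return path
```

Calling write_run(".", np.ones(10)) on each run would then leave one file per run, as long as runs start at least a second apart.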


Demo -

>>> import datetime
>>> filename = "test_{}.hdf5".format(datetime.datetime.now().strftime("%Y_%m_%d_%H_%M_%S"))
>>> filename
'test_2015_08_09_13_33_43.hdf5'
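
If the goal is instead to keep adding datasets to the same file, as in the question, the following is a rough sketch rather than a definitive solution. It assumes that 'a' mode is used for every open (h5py opens the file read/write if it exists and creates it otherwise) and that a dataset name already present from an earlier run should be left alone. The helper add_dataset is a hypothetical name introduced for illustration.

```python
import h5py
import numpy as np


def add_dataset(path, name, data):
    """Add a dataset to the file at path, creating the file if needed.

    Returns True if the dataset was created, False if the name was
    already present and the existing data was left untouched.
    """
    # 'a' opens read/write if the file exists and creates it otherwise,
    # so the same statement works on the first call and on later ones.
    with h5py.File(path, "a") as f:
        if name in f:
            # create_dataset would raise here because the name exists.
            return False
        f.create_dataset(name, data=data)
        return True
```

With this, add_dataset('test.hdf5', 'data1', np.ones(10)) followed by add_dataset('test.hdf5', 'data0', np.zeros(10)) keeps both datasets, and rerunning the script does not hit the "Name already exists" error because the existing names are skipped.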