I use the Python package h5py (version 2.5.0) to access my hdf5 files.
I want to traverse the content of a file and do something with every dataset.
Using the visit method:
import h5py

def print_it(name):
    # f is the open file object defined in the with block below
    dset = f[name]
    print(dset)
    print(type(dset))

with h5py.File('test.hdf5', 'r') as f:
    f.visit(print_it)
for a test file I obtain:
<HDF5 group "/x" (1 members)>
<class 'h5py._hl.group.Group'>
<HDF5 dataset "y": shape (100, 100, 100), type "<f8">
<class 'h5py._hl.dataset.Dataset'>
which tells me that there is a dataset and a group in the file. However, there is no obvious way, except for using type(), to differentiate between the datasets and the groups. The h5py documentation unfortunately does not say anything about this topic. It always assumes that you know beforehand which items are groups and which are datasets, for example because you created the datasets yourself.
I would like to have something like:
f = h5py.File(..)
for key in f.keys():
    x = f[key]
    print(x.is_group(), x.is_dataset())  # does not exist
How can I differentiate between groups and datasets when reading an unknown hdf5 file in Python with h5py? How can I get a list of all datasets, of all groups, of all links?
Unfortunately, there is no built-in method in the h5py API to check this, but you can simply check the type of the item with is_dataset = isinstance(item, h5py.Dataset).
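As a minimal sketch of that isinstance check applied to the top-level keys of a file: the helper names describe and describe_top_level are made up for illustration and are not part of h5py. Note that h5py.File is a subclass of h5py.Group, so a file object would also be reported as a group.

```python
import h5py

def describe(item):
    """Return 'group', 'dataset', or 'other' for an h5py object (hypothetical helper)."""
    if isinstance(item, h5py.Group):
        return "group"
    if isinstance(item, h5py.Dataset):
        return "dataset"
    # Anything else, e.g. a named datatype
    return "other"

def describe_top_level(path):
    """Map each top-level key of the file at `path` to its kind (hypothetical helper)."""
    with h5py.File(path, "r") as f:
        return {key: describe(f[key]) for key in f.keys()}
```

Calling describe_top_level('test.hdf5') on the test file from the question should report the key x as a group.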
To list all the contents of the file (except the file's attributes) you can use Group.visititems with a callable that takes the name and the instance of an item.
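A rough sketch of how visititems can be combined with the isinstance check to get separate lists of group names and dataset names. The function name list_contents and the inner callback collect are my own choices for illustration, not h5py API.

```python
import h5py

def list_contents(path):
    """Return (groups, datasets) as lists of names found in the file (hypothetical helper)."""
    groups, datasets = [], []

    def collect(name, obj):
        # visititems passes the name relative to the starting group and the object itself
        if isinstance(obj, h5py.Dataset):
            datasets.append(name)
        elif isinstance(obj, h5py.Group):
            groups.append(name)
        # Returning None tells visititems to continue the traversal

    with h5py.File(path, "r") as f:
        f.visititems(collect)
    return groups, datasets
```

As far as I know, visititems only reaches objects through hard links, so soft and external links will not show up here. For link information you can try Group.get(name, getlink=True), which returns a link object such as h5py.HardLink, h5py.SoftLink or h5py.ExternalLink instead of the item itself.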