I've been using the following bit of code to open some HDF5 files, produced in MATLAB, in Python using h5py:
import h5py as h5
data='dataset.mat'
f=h5.File(data, 'r')
However, I'm getting the following error:
OSError: Unable to open file (File signature not found)
I've checked that the files I'm trying to open are version 7.3 MAT-files and are in HDF5 format. In fact, I've used h5py to open the same files successfully before. I've confirmed that the files exist and are accessible, so I'm not really sure where the error is coming from. Any advice would be greatly appreciated, thanks in advance : )
Usually the message "File signature not found" indicates one of the following:
1. Your file is corrupted.
This is what I think is most likely. You said you've opened the files before. Maybe you forgot to close your file handle, which can corrupt the file.
Try checking the file with the HDF5 utility h5debug (available on the command line if you've installed the HDF5 library on your OS; on Debian-based Linux you can check with dpkg -s libhdf5-dev).
2. The file is not in HDF5 format.
This is a known cause of this error message. Since you said you made sure the files are HDF5 and you've opened them before, I'm giving this just as a reference for others who may stumble here:
Version 7.3 MAT-files (available since MATLAB R2006b) use an HDF5-based format, unlike the older MAT-File Level 5 format. Earlier-version MAT-files (v4 (Level 1.0), v6, and v7 to 7.2) are not HDF5, but they are supported by and can be read with the scipy library:
import scipy.io
f = scipy.io.loadmat('dataset.mat')
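To tell the two cases above apart before picking a reader, you could inspect the first bytes of the file yourself. The following is only a rough sketch using the standard library; the function name detect_mat_format and its return labels are my own, not part of h5py or scipy. It relies on the HDF5 rule that the 8-byte signature sits at offset 0, 512, 1024, 2048 and so on, and on the descriptive text "MATLAB 5.0 MAT-file" that older MAT-files normally start with.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 superblock signature


def detect_mat_format(path):
    """Return 'hdf5', 'mat5', or 'unknown' (hypothetical helper, labels are illustrative)."""
    with open(path, "rb") as fh:
        header = fh.read(128)  # start of the file, used for the MAT v5 text check
        fh.seek(0, 2)          # jump to the end to learn the file size
        size = fh.tell()
        # The HDF5 signature may sit at offset 0, 512, 1024, 2048, ...
        # (v7.3 MAT-files normally put a 512-byte text header in front of it).
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return "hdf5"
            offset = 512 if offset == 0 else offset * 2
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "mat5"
    return "unknown"
```

A result of 'hdf5' suggests h5py should at least recognise the file, 'mat5' points to scipy.io.loadmat, and 'unknown' hints at a truncated or otherwise damaged file. If h5py is installed, h5py.is_hdf5('dataset.mat') performs a similar signature check.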
Otherwise you may try other methods and see whether the error persists:
PyTables is an alternative to h5py that can also read HDF5 files.
import tables
f = tables.open_file('dataset.mat', mode='r')
Install using
pip install tables
The MATLAB Engine for Python is another way to read MAT-files, if you have MATLAB installed. See the MathWorks documentation: MATLAB Engine API for Python.
import matlab.engine
mat = matlab.engine.start_matlab()
f = mat.load("dataset.mat", nargout=1)