I recently came across the readinto method of the file object (in Python 2.7); it is similar to fread in C. It seems convenient and powerful in some cases. I plan to use it to read several files into one pre-allocated numpy array without copying the data.
For example:
a = np.empty(N)
b = memoryview(a)
fp1.readinto(b[0:100])
fp2.readinto(b[100:200])
and
fp1.readinto(b[0:100])
fp1.seek(400, 1)
fp1.readinto(b[100:200])
I used Cython and fread to do this before I found readinto, so I'm very happy to learn there is a pure Python solution.
However, its docstring says:
file.readinto?
Type: method_descriptor
String form: <method 'readinto' of 'file' objects>
Namespace: Python builtin
Docstring: readinto() -> Undocumented. Don't use this; it may go away.
Don't use this? What happened?
So I'm confused: should I use readinto or not? Could it cause any unwanted problems?
Is there an alternative implementation of the code above that does not use readinto but still avoids copying data? (Avoiding copies means np.concatenate or np.stack is not a good choice.)
Any suggestion is welcome! Thank you.
-------update-------
It seems that I can use io.FileIO from the standard library instead of the built-in function open. It looks OK, so I've posted it as an answer.
Any comment or other solution is still welcome!
-------update-------
If you run into the same problem, you may want to have a look at the comments below by Andrea Corbellini and Padraic Cunningham.
If you are unsure about file.readinto, you can use io.FileIO from the Python standard library instead of the built-in open function or file.
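As a rough sketch of how the snippets from the question could look with io.FileIO, the example below writes two small test files and then reads them into one pre-allocated array. The file names, array sizes, and float64 dtype are assumptions made for illustration only, and the code is written for Python 3.

```python
import io
import os
import tempfile

import numpy as np

# Create two small binary files of float64 values (hypothetical test data).
tmpdir = tempfile.mkdtemp()
path1 = os.path.join(tmpdir, "part1.bin")
path2 = os.path.join(tmpdir, "part2.bin")
np.arange(250, dtype=np.float64).tofile(path1)
np.arange(1000, 1100, dtype=np.float64).tofile(path2)

# Pre-allocate the destination and expose its memory through a memoryview.
a = np.empty(300, dtype=np.float64)
b = memoryview(a)

with io.FileIO(path1, "r") as fp1, io.FileIO(path2, "r") as fp2:
    # Each call writes directly into the slice and returns the byte count.
    n1 = fp1.readinto(b[0:100])      # values 0..99
    fp1.seek(400, io.SEEK_CUR)       # skip 400 bytes, i.e. 50 float64 values
    n2 = fp1.readinto(b[100:200])    # values 150..249
    n3 = fp2.readinto(b[200:300])    # values 1000..1099
```

Slicing the memoryview does not copy anything, so each readinto call fills the corresponding part of a in place. The slice indices count float64 elements, while seek and the return values count bytes.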
Here's the docstring:
#io.FileIO.readinto?
Type: method_descriptor
String form: <method 'readinto' of '_io.FileIO' objects>
Docstring: readinto() -> Same as RawIOBase.readinto().
The documentation for io.RawIOBase.readinto can be found here.
class io.RawIOBase
...
readinto(b)
Read up to len(b) bytes into bytearray b and return the number of bytes read. If the object is in non-blocking mode and no bytes are available, None is returned.
It's available in both Python 2 and 3.
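Because the documentation says readinto reads "up to" len(b) bytes, a single call may return fewer bytes than requested. The helper below is a minimal sketch of one way to handle that; readinto_exactly is a hypothetical name, not part of the standard library, and it assumes a byte-oriented buffer such as a bytearray.

```python
import io


def readinto_exactly(fp, buf):
    """Fill buf completely from fp, or raise EOFError (hypothetical helper)."""
    view = memoryview(buf)  # assumes one byte per item, e.g. a bytearray
    total = 0
    while total < len(view):
        n = fp.readinto(view[total:])
        if n is None:
            # Non-blocking stream with no data available right now.
            raise IOError("no data available in non-blocking mode")
        if n == 0:
            raise EOFError("stream ended after %d bytes" % total)
        total += n
    return total


# Demonstration with an in-memory stream standing in for a real file.
src = io.BytesIO(b"abcdefgh")
buf = bytearray(4)
n = readinto_exactly(src, buf)
```

For regular files on disk a short read is unlikely except at end of file, but pipes and sockets can return partial data, so checking the return value is the safer habit.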