Should I use the `readinto` method of a Python file object or not?

Syrtis Major · Jan 13, 2016 · Viewed 10.5k times

I recently came across the readinto method of the file object (in Python 2.7); it is similar to fread in C. It seems convenient and powerful in some cases. I plan to use it to read several files into one pre-allocated numpy array without copying the data.

e.g.

a = np.empty(N)
b = memoryview(a)
fp1.readinto(b[0:100])
fp2.readinto(b[100:200])

and

fp1.readinto(b[0:100])
fp1.seek(400, 1)
fp1.readinto(b[100:200])

I used Cython and fread to do this before I came across readinto, so I'm very happy to find a pure-Python solution.

However, its docstring says:

file.readinto?
Type:        method_descriptor
String form: <method 'readinto' of 'file' objects>
Namespace:   Python builtin
Docstring:   readinto() -> Undocumented.  Don't use this; it may go away.

Don't use this? What happened?

So I'm confused: should I use readinto or not? Could it cause any unwanted problems?

Is there an alternative implementation of the code above that does not use readinto but still avoids copying the data? (Avoiding the copy means np.concatenate and np.stack are not good choices.)

Any suggestions are welcome! Thank you.

-------update-------

It seems that I can use io.FileIO from the standard library instead of the built-in function open. It looks OK, so I've posted it as an answer.

Any comments or other solutions are still welcome!

-------update-------

If you run into the same problem, you may want to have a look at the comments below by Andrea Corbellini and Padraic Cunningham.

Answer

Syrtis Major · Jan 13, 2016

If you are unsure about file.readinto, you can use io.FileIO from the Python standard library instead of the built-in open or file.

Here's the docstring:

#io.FileIO.readinto?
Type:        method_descriptor
String form: <method 'readinto' of '_io.FileIO' objects>
Docstring:   readinto() -> Same as RawIOBase.readinto().

The documentation for io.RawIOBase.readinto can be found here.

class io.RawIOBase

...

readinto(b)

Read up to len(b) bytes into bytearray b and return the number of bytes read. If the object is in non-blocking mode and no bytes are available, None is returned.
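Since the documentation says "up to len(b) bytes", one call may return fewer bytes than the buffer holds. Below is a minimal sketch of a retry loop, written for Python 3 (memoryview.cast does not exist in Python 2.7). The helper name readinto_fully and the temporary demo file are made up for illustration and are not part of the io module.

```python
import io
import os
import tempfile


def readinto_fully(raw, buf):
    """Keep calling raw.readinto until buf is full or no more data arrives.

    Returns the number of bytes actually stored in buf.
    """
    view = memoryview(buf).cast("B")  # flat byte view, so offsets are in bytes
    total = 0
    while total < len(view):
        n = raw.readinto(view[total:])
        if n is None:  # non-blocking stream with nothing available right now
            break
        if n == 0:  # end of file
            break
        total += n
    return total


# Demo with a hypothetical 10-byte file and a 16-byte buffer.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

buf = bytearray(16)
with io.FileIO(path, "r") as raw:
    count = readinto_fully(raw, buf)
os.remove(path)

print(count, bytes(buf[:count]))
```

The return value tells the caller how much of the buffer is valid; the rest of the buffer is left untouched.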

io.FileIO.readinto is available in both Python 2 and 3.
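For the use case in the question, here is a rough sketch of reading two files into slices of one pre-allocated array through io.FileIO. It is written for Python 3 and assumes the files contain raw float64 values. The file names, sizes, and contents are invented for the demo, and the array slices are passed to readinto directly, relying on NumPy arrays exposing a writable buffer.

```python
import io
import os
import shutil
import tempfile

import numpy as np

# Hypothetical input: two small files of raw float64 values, created for the demo.
tmpdir = tempfile.mkdtemp()
chunks = [np.arange(0.0, 3.0), np.arange(10.0, 15.0)]
paths = []
for i, values in enumerate(chunks):
    path = os.path.join(tmpdir, "part%d.bin" % i)
    values.tofile(path)  # raw binary dump, no header
    paths.append(path)

# One pre-allocated destination; each file lands in its own slice.
a = np.empty(8, dtype=np.float64)
offset = 0
for path, values in zip(paths, chunks):
    count = len(values)
    with io.FileIO(path, "r") as fp:
        # A basic slice of a NumPy array is a view, so the bytes go straight into `a`.
        nbytes = fp.readinto(a[offset:offset + count])
    if nbytes != count * a.itemsize:
        raise IOError("short read from %s" % path)
    offset += count

shutil.rmtree(tmpdir)
print(a)
```

As in the question, fp.seek(n, 1) between two readinto calls on the same file would skip n bytes before the next slice is filled.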