I have a large text file (~7 GB). I am looking for the fastest way to read it. I have been reading about several approaches, such as reading it chunk by chunk in order to speed up the process.
For example, effbot suggests
# File: readline-example-3.py
file = open("sample.txt")
while 1:
    lines = file.readlines(100000)
    if not lines:
        break
    for line in lines:
        pass  # do something
in order to process 96,900 lines of text per second.
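To sanity-check a number like that on my own file, I put together this rough timing sketch (my own code, not effbot's; "sample.txt" and the 100000 size hint are placeholders):

import time

count = 0
start = time.time()
with open("sample.txt") as f:        # placeholder path
    while True:
        lines = f.readlines(100000)  # same sizehint-based chunking as above
        if not lines:
            break
        count += len(lines)
elapsed = time.time() - start
print("%d lines in %.2f s (%.0f lines/s)" % (count, elapsed, count / elapsed))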
Other authors suggest using islice():

from itertools import islice

with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines
list(islice(f, n)) will return a list of the next n lines of the file f. Using this inside a loop will give you the file in chunks of n lines. Finally,
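If it helps, this is how I would wrap that islice pattern in a reusable generator (my own sketch and naming; the path and the chunk size of 100000 lines are placeholders):

from itertools import islice

def line_chunks(f, n):
    # yield lists of up to n lines until the file is exhausted
    while True:
        chunk = list(islice(f, n))
        if not chunk:
            return
        yield chunk

with open("sample.txt") as f:  # placeholder path
    for chunk in line_chunks(f, 100000):
        pass  # process chunk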
with open(<FILE>) as FileObj:
    for line in FileObj:
        print(line)  # or do some other thing with the line...

will read one line at a time into memory, and close the file when done.
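To compare the three approaches myself, I was planning to time them with something like the harness below (my own sketch; the path is a placeholder, and OS file caching can skew results, so each test should probably be run more than once):

import time
from itertools import islice

PATH = "sample.txt"  # placeholder: point this at the real file

def by_readlines_hint():
    # effbot-style chunking via a sizehint to readlines()
    with open(PATH) as f:
        while True:
            lines = f.readlines(100000)
            if not lines:
                break
            for line in lines:
                pass

def by_islice():
    # chunking via islice, 100000 lines at a time
    with open(PATH) as f:
        while True:
            chunk = list(islice(f, 100000))
            if not chunk:
                break
            for line in chunk:
                pass

def by_iteration():
    # plain line-by-line iteration over the file object
    with open(PATH) as f:
        for line in f:
            pass

for func in (by_readlines_hint, by_islice, by_iteration):
    start = time.time()
    func()
    print("%s: %.2f s" % (func.__name__, time.time() - start))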