I have to read a large text file of, say, 25 GB and need to process it within 15-20 minutes. The file contains multiple header and footer sections.
I tried csplit to split the file into a number of files based on the headers, but that takes around 24 to 25 minutes, which is not acceptable.
I tried sequential reading and writing using BufferedReader and BufferedWriter along with FileReader and FileWriter. That takes more than 27 minutes, which again is not acceptable.
I tried another approach: get the start index of each header, then run multiple threads that each read the file from a specific location using RandomAccessFile. But no luck with this either.
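For illustration, an offset-based parallel read of this kind might look like the sketch below. It is not the exact code that was tried: the names SectionReader, readSection and readAll are hypothetical, the header offsets are assumed to have been collected in an earlier pass, and the "processing" is reduced to counting bytes.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SectionReader {
    /** Reads bytes [start, end) of the file and returns how many were read. */
    static long readSection(String path, long start, long end) throws IOException {
        byte[] buffer = new byte[1 << 20]; // 1 MB chunk per read (assumed size)
        long total = 0;
        // Each task opens its own handle so the file positions do not interfere.
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.seek(start);
            while (total < end - start) {
                int want = (int) Math.min(buffer.length, end - start - total);
                int n = raf.read(buffer, 0, want);
                if (n == -1) {
                    break; // reached end of file early
                }
                total += n; // real processing of buffer[0..n) would go here
            }
        }
        return total;
    }

    /** Runs one task per section; offsets[i] is the start offset of section i. */
    static long readAll(String path, long[] offsets, long fileLength, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < offsets.length; i++) {
                final long start = offsets[i];
                final long end = (i + 1 < offsets.length) ? offsets[i + 1] : fileLength;
                Callable<Long> task = () -> readSection(path, start, end);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // waits for the task and rethrows its failure
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this beats a single sequential pass depends on the storage: several threads seeking on one spinning disk can easily end up slower.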
How can I achieve my requirement?
Try using a much larger read buffer (for example, 20 MB instead of 2 MB) to process your data faster. Also, don't use a BufferedReader: it is slower here because it converts every byte to characters.
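A minimal sketch of what that could look like, assuming the headers can be detected on raw bytes: it reads through a FileChannel into one large ByteBuffer and only counts newline bytes as a stand-in for real processing. The class name LargeBufferScan and the method countLines are made up for this example, and the 20 MB figure is just the value suggested above, not a tuned number.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class LargeBufferScan {
    // Assumed buffer size: 20 MB, as suggested above.
    static final int BUFFER_SIZE = 20 * 1024 * 1024;

    /** Counts '\n' bytes, reading raw bytes in large chunks with no character decoding. */
    static long countLines(Path file, int bufferSize) throws IOException {
        long lines = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocateDirect(bufferSize);
            while (channel.read(buffer) != -1) {
                buffer.flip(); // switch from filling the buffer to draining it
                while (buffer.hasRemaining()) {
                    if (buffer.get() == '\n') {
                        lines++;
                    }
                }
                buffer.clear(); // make the whole buffer available for the next read
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LargeBufferScan /path/to/big-file.txt
        System.out.println(countLines(Paths.get(args[0]), BUFFER_SIZE));
    }
}
```

Header detection would replace the newline check in the inner loop. It is worth timing a few buffer sizes on the real file, since the best value depends on the disk and the OS.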
This question has been asked before: Read large files in Java