I am really in trouble: I want to read HUGE files of several GB using FileChannels and MappedByteBuffers. All the documentation I found implies it's rather simple to map a file using the FileChannel.map() method.
Of course there is a limit of 2 GB per mapping, as all the Buffer methods use int for position, limit and capacity - but what about the system-imposed limits below that?
In reality, I get lots of problems with OutOfMemoryErrors! And there is no documentation at all that really defines the limits!
So - how can I safely map a file that fits within the int limit into one or several MappedByteBuffers without just getting exceptions?
Can I ask the system which portion of a file I can safely map before I try FileChannel.map()? How?
Why is there so little documentation about this feature??
I can offer some working code. Whether this solves your problem or not is difficult to say. It hunts through a file for a pattern recognised by the Hunter.
See the excellent article Java tip: How to read files quickly for the original research (not mine).
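The Hunter type itself is not shown in this answer. Judging only from the calls made on it in the code (ok(), check(byte) and close()), it might look something like the sketch below. Both the interface and the PatternHunter class are hypothetical and written purely for illustration; the real Hunter may well differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical interface, inferred only from the calls ok(), check(byte) and close().
interface Hunter {
    boolean ok();        // false once scanning should stop
    void check(byte b);  // fed one byte at a time, in file order
    void close();        // called once when scanning ends
}

// Hypothetical example: stops at the first occurrence of an ASCII pattern.
// Uses a KMP failure table so that bytes never need to be re-read.
class PatternHunter implements Hunter {
    private final byte[] pattern;
    private final int[] fail;
    private int matched = 0;    // number of pattern bytes currently matched
    private long position = 0;  // number of bytes seen so far
    private long foundAt = -1;  // offset of the first match, or -1

    PatternHunter(String text) {
        if (text.isEmpty()) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        pattern = text.getBytes(StandardCharsets.US_ASCII);
        fail = new int[pattern.length];
        for (int i = 1, k = 0; i < pattern.length; i++) {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }
    }

    @Override
    public boolean ok() {
        return foundAt < 0;
    }

    @Override
    public void check(byte b) {
        while (matched > 0 && b != pattern[matched]) matched = fail[matched - 1];
        if (b == pattern[matched]) matched++;
        position++;
        if (matched == pattern.length) {
            foundAt = position - pattern.length;
            matched = fail[matched - 1];
        }
    }

    @Override
    public void close() {
        // Nothing to release in this sketch.
    }

    long foundAt() {
        return foundAt;
    }
}
```

Because the hunter keeps its own state between calls, a match that straddles two mapped regions or two buffer refills is still found.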
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// 4k buffer size.
static final int SIZE = 4 * 1024;
static byte[] buffer = new byte[SIZE];

// Fastest because a FileInputStream has an associated channel.
private static void ScanDataFile(Hunter p, FileInputStream f) throws IOException {
    // Map the file in regions of at most Integer.MAX_VALUE bytes and copy
    // each region through a small byte[] buffer for best speed.
    // See: http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly
    FileChannel ch = f.getChannel();
    long red = 0L; // Bytes already processed.
    do {
        // Size of the next region to map.
        long read = Math.min(Integer.MAX_VALUE, ch.size() - red);
        MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, red, read);
        int nGet;
        while (mb.hasRemaining() && p.ok()) {
            nGet = Math.min(mb.remaining(), SIZE);
            mb.get(buffer, 0, nGet);
            // Feed the hunter one byte at a time until it says stop.
            for (int i = 0; i < nGet && p.ok(); i++) {
                p.check(buffer[i]);
            }
        }
        red += read;
    } while (red < ch.size() && p.ok());
    // Finish off.
    p.close();
    ch.close();
    f.close();
}
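If mapping close to 2 GB in one go fails on your system, one thing to try is smaller regions, halving the region size whenever the map attempt fails. As far as I know there is no API for asking in advance how much can safely be mapped, so the sketch below simply tries and backs off. It assumes a failed mapping surfaces as an IOException from FileChannel.map(); the class name, the method names and the 1 MiB floor are made up for illustration, and counting one byte value just stands in for the real per-byte work.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class WindowedScan {
    // Hypothetical floor: stop shrinking once a region is this small.
    static final long MIN_WINDOW = 1L << 20; // 1 MiB

    // Maps up to `size` bytes starting at `offset`, read-only.
    // On any IOException the region is halved and the map is retried;
    // at the floor the exception is rethrown to the caller.
    static MappedByteBuffer mapWithBackoff(FileChannel ch, long offset, long size)
            throws IOException {
        long window = Math.min(size, Integer.MAX_VALUE);
        while (true) {
            try {
                return ch.map(FileChannel.MapMode.READ_ONLY, offset, window);
            } catch (IOException e) {
                if (window <= MIN_WINDOW) {
                    throw e;
                }
                window = Math.max(MIN_WINDOW, window / 2);
            }
        }
    }

    // Walks the whole file region by region and counts occurrences of `target`.
    static long countByte(Path file, byte target, long startWindow) throws IOException {
        if (startWindow <= 0) {
            throw new IllegalArgumentException("startWindow must be positive");
        }
        long count = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            long done = 0;
            while (done < size) {
                long want = Math.min(startWindow, size - done);
                MappedByteBuffer mb = mapWithBackoff(ch, done, want);
                // The region actually mapped may be smaller than requested.
                long mapped = mb.capacity();
                while (mb.hasRemaining()) {
                    if (mb.get() == target) {
                        count++;
                    }
                }
                done += mapped;
            }
        }
        return count;
    }
}
```

A call such as WindowedScan.countByte(path, (byte) '\n', 256L << 20) would start with 256 MiB regions. Note that a mapping stays in place until its buffer is garbage collected, so many large regions mapped in quick succession can still exhaust address space, particularly on a 32-bit JVM.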