In my application I'm trying to merge sorted files (keeping the result sorted, of course), so I have to iterate through every element in both files and write the smaller one to a third file. This is quite slow on big files. Since I don't see any way around the iteration itself, I'm trying to optimize the file loading instead.
I have some RAM available for buffering. Instead of reading 4 bytes from each file every time, I could read something like 100 MB at once, work from that buffer until it is exhausted, and then refill it. But I guess ifstream already does that internally. Would my own buffering give better performance, and is there any reason to do it? If fstream does buffer, can I change the size of its buffer?
Added:
My current code looks like this (pseudocode):
// this is done in a loop
int i1 = input1.read_integer();
int i2 = input2.read_integer();
if (!input1.eof() && !input2.eof())
{
    if (i1 < i2)
    {
        output.write(i1);
        input2.seek_back(sizeof(int));  // i2 was not consumed, so re-read it next time
    } else {
        input1.seek_back(sizeof(int));  // i1 was not consumed, so re-read it next time
        output.write(i2);
    }
} else {
    if (input1.eof())
        output.write(i2);
    else if (input2.eof())
        output.write(i1);
}
What I don't like here is
Can you suggest an improvement for that?
Thanks.
Without getting into the discussion on stream buffers, you can get rid of the seek_back
and generally make the code much simpler by doing:
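using namespace std;
// file1 and file2 are open input streams of text-formatted integers
merge(istream_iterator<int>(file1), istream_iterator<int>(),
      istream_iterator<int>(file2), istream_iterator<int>(),
      ostream_iterator<int>(cout, " "));  // the delimiter keeps the numbers separated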
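To try the idea without touching any files, here is a minimal sketch that runs the same std::merge call over two in-memory string streams. The helper name merge_sorted_text is hypothetical, and the sketch assumes both inputs hold whitespace-separated integers in ascending order.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the question's code): merges two
// whitespace-separated, ascending integer sequences and returns the
// merged sequence as text, each number followed by a single space.
std::string merge_sorted_text(const std::string& a, const std::string& b)
{
    std::istringstream in1(a);
    std::istringstream in2(b);
    std::ostringstream out;

    // istream_iterator<int> parses formatted text; a default-constructed
    // iterator marks the end of the stream.
    std::merge(std::istream_iterator<int>(in1), std::istream_iterator<int>(),
               std::istream_iterator<int>(in2), std::istream_iterator<int>(),
               std::ostream_iterator<int>(out, " "));
    return out.str();
}
```

Calling merge_sorted_text("1 4 9", "2 3 10") yields "1 2 3 4 9 10 ". Note that this only works for text input: istream_iterator<int> uses operator>>, which parses digits rather than raw bytes.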
Added binary capability
#include <algorithm>
#include <iterator>
#include <fstream>
#include <iostream>

// Wraps an int so that operator>> reads it as raw bytes instead of parsing text.
struct BinInt
{
    int value;
    operator int() const { return value; }
    friend std::istream& operator>>(std::istream& stream, BinInt& data)
    {
        return stream.read(reinterpret_cast<char*>(&data.value), sizeof(int));
    }
};

int main()
{
    // Open in binary mode so that no newline translation touches the raw bytes.
    std::ifstream file1("f1.txt", std::ios::binary);
    std::ifstream file2("f2.txt", std::ios::binary);

    // The merged values are printed as text, separated by spaces.
    std::merge(std::istream_iterator<BinInt>(file1), std::istream_iterator<BinInt>(),
               std::istream_iterator<BinInt>(file2), std::istream_iterator<BinInt>(),
               std::ostream_iterator<int>(std::cout, " "));
}