std::fstream buffering vs manual buffering (why 10x gain with manual buffering)?

Vincent · Oct 21, 2012 · Viewed 24.7k times

I have tested two writing configurations:

  1. Fstream buffering:

    // Initialization
    const unsigned int length = 8192;
    char buffer[length];
    std::ofstream stream;
    stream.rdbuf()->pubsetbuf(buffer, length);
    stream.open("test.dat", std::ios::binary | std::ios::trunc);
    
    // To write I use :
    stream.write(reinterpret_cast<char*>(&x), sizeof(x));
    
  2. Manual buffering:

    // Initialization
    const unsigned int length = 8192;
    char buffer[length];
    std::ofstream stream("test.dat", std::ios::binary | std::ios::trunc);
    
    // Then I copy the data into the buffer manually
    
    // To write I use :
    stream.write(buffer, length);
    

I expected the same result...

But my manual buffering improves performance by a factor of 10 when writing a 100 MB file, while the fstream buffering changes nothing compared to the default situation (without redefining a buffer).

Does anyone have an explanation for this?

EDIT: Here is an update: a benchmark I just ran on a supercomputer (64-bit Linux, latest 8-core Intel Xeon, Lustre filesystem and ... hopefully well-configured compilers): [image: benchmark]. I cannot explain the "resonance" for a 1 kB manual buffer...

EDIT 2: And the resonance at 1024 B (if anyone has an idea about that, I'm interested): [image]

Answer

Vaughn Cato · Oct 21, 2012

This is basically due to function call overhead and indirection. The ofstream::write() method is inherited from ostream. That function is not inlined in libstdc++, which is the first source of overhead. Then ostream::write() has to call rdbuf()->sputn() to do the actual writing.

On top of that, sputn() is only a thin non-virtual wrapper: libstdc++ forwards it to the virtual function xsputn(), so every write() also pays for a virtual function call.
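The call chain can be made visible with a small experiment. The sketch below is not libstdc++ code: `CountingBuf` and `count_virtual_calls` are hypothetical names, and it assumes the standard library forwards `ostream::write()` to `sputn()`, as libstdc++ and libc++ do. It plugs a custom stream buffer into a `std::ostream` and counts how often the virtual `xsputn()` is reached when small objects are written one at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>

// Hypothetical stream buffer: discards the data and counts xsputn() calls.
class CountingBuf : public std::streambuf {
public:
    std::size_t xsputn_calls = 0;
    std::size_t bytes = 0;

protected:
    // Virtual hook that sputn() forwards to.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        (void)s;  // the payload itself is ignored in this sketch
        ++xsputn_calls;
        bytes += static_cast<std::size_t>(n);
        return n;
    }
};

// Writes `count` doubles one by one through std::ostream::write() and
// returns how many times the virtual xsputn() was invoked.
std::size_t count_virtual_calls(std::size_t count) {
    CountingBuf buf;
    std::ostream out(&buf);
    const double x = 3.14;
    for (std::size_t i = 0; i < count; ++i) {
        out.write(reinterpret_cast<const char*>(&x), sizeof(x));
    }
    // Assumption: write() went through sputn()/xsputn() for every byte.
    assert(buf.bytes == count * sizeof(x));
    return buf.xsputn_calls;
}
```

Under that assumption, writing N small objects costs N trips through the non-inlined `write()` and N virtual calls, regardless of how large the buffer behind the stream is.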

If you put the characters into the buffer yourself, you can avoid that overhead.
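A minimal sketch of what "putting the characters into the buffer yourself" can look like follows. `ChunkedWriter` and `write_ints` are hypothetical names introduced for illustration, not part of the original question or of any library. Small objects are copied with `memcpy` into a block of memory, and the stream only sees one `write()` per full block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects small writes in memory and passes them
// to the stream in large chunks.
class ChunkedWriter {
public:
    ChunkedWriter(std::ostream& out, std::size_t capacity)
        : out_(out), buffer_(capacity), used_(0) {
        assert(capacity > 0);
    }
    ~ChunkedWriter() { flush(); }  // write out whatever is left
    ChunkedWriter(const ChunkedWriter&) = delete;
    ChunkedWriter& operator=(const ChunkedWriter&) = delete;

    // Copies raw bytes into the buffer, flushing each time it fills up.
    void put(const void* data, std::size_t size) {
        const char* src = static_cast<const char*>(data);
        while (size > 0) {
            const std::size_t room = buffer_.size() - used_;
            const std::size_t n = size < room ? size : room;
            std::memcpy(buffer_.data() + used_, src, n);
            used_ += n;
            src += n;
            size -= n;
            if (used_ == buffer_.size()) flush();
        }
    }

    // One stream call for the whole chunk.
    void flush() {
        if (used_ > 0) {
            out_.write(buffer_.data(), static_cast<std::streamsize>(used_));
            used_ = 0;
        }
    }

private:
    std::ostream& out_;
    std::vector<char> buffer_;
    std::size_t used_;
};

// Usage sketch: write `count` ints through a buffer of `capacity` bytes.
std::string write_ints(int count, std::size_t capacity) {
    std::ostringstream out;
    {
        ChunkedWriter writer(out, capacity);
        for (int i = 0; i < count; ++i) writer.put(&i, sizeof(i));
    }  // destructor flushes the tail
    return out.str();
}
```

The string stream is used here only to keep the sketch self-contained; a `std::ofstream` opened with `std::ios::binary` could be passed to `ChunkedWriter` in the same way. With an 8192-byte buffer, the per-call overhead is then paid once per 8 kB instead of once per object.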