Lately I've been asked to write a function that reads a binary file into a std::vector<BYTE>, where BYTE is an unsigned char. Quite quickly I came up with something like this:
#include <fstream>
#include <vector>
typedef unsigned char BYTE;
std::vector<BYTE> readFile(const char* filename)
{
// open the file:
std::streampos fileSize;
std::ifstream file(filename, std::ios::binary);
// get its size:
file.seekg(0, std::ios::end);
fileSize = file.tellg();
file.seekg(0, std::ios::beg);
// read the data:
std::vector<BYTE> fileData(fileSize);
file.read((char*) &fileData[0], fileSize);
return fileData;
}
which seems unnecessarily complicated, and the explicit cast to char* that I was forced to use when calling file.read doesn't make me feel any better about it.
Another option is to use std::istreambuf_iterator:
std::vector<BYTE> readFile(const char* filename)
{
// open the file:
std::ifstream file(filename, std::ios::binary);
// read the data:
return std::vector<BYTE>((std::istreambuf_iterator<char>(file)),
std::istreambuf_iterator<char>());
}
which is pretty simple and short, but I still have to use std::istreambuf_iterator<char> even though I'm reading into a std::vector<unsigned char>.
The last option, which seems perfectly straightforward, is to use std::basic_ifstream<BYTE>, which says explicitly: "I want an input file stream and I want to use it to read BYTEs":
std::vector<BYTE> readFile(const char* filename)
{
// open the file:
std::basic_ifstream<BYTE> file(filename, std::ios::binary);
// read the data:
return std::vector<BYTE>((std::istreambuf_iterator<BYTE>(file)),
std::istreambuf_iterator<BYTE>());
}
but I'm not sure whether basic_ifstream is an appropriate choice in this case.
What is the best way of reading a binary file into the vector? I'd also like to know what's happening behind the scenes and what possible problems I might encounter (apart from the stream not being opened properly, which can be caught with a simple is_open check).
Is there any good reason to prefer std::istreambuf_iterator here? (The only advantage I can see is simplicity.)
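For reference, the is_open check mentioned above can be combined with two other guards that the first version lacks: tellg reporting failure, and an empty file (where &fileData[0] would index an empty vector). The following is only a sketch; the name readFileChecked and the choice to throw std::runtime_error are assumptions for illustration, not part of the original code.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <vector>

typedef unsigned char BYTE;

// Hypothetical variant of the first readFile with basic error checks.
std::vector<BYTE> readFileChecked(const char* filename)
{
    std::ifstream file(filename, std::ios::binary);
    if (!file.is_open())
        throw std::runtime_error("cannot open file");

    // get its size; tellg yields -1 on failure
    file.seekg(0, std::ios::end);
    const std::streamoff fileSize = file.tellg();
    if (fileSize < 0)
        throw std::runtime_error("cannot determine file size");
    file.seekg(0, std::ios::beg);

    // read the data, skipping the read entirely for an empty file
    std::vector<BYTE> fileData(static_cast<std::size_t>(fileSize));
    if (!fileData.empty())
    {
        file.read(reinterpret_cast<char*>(&fileData[0]), fileSize);
        if (!file)
            throw std::runtime_error("could not read the whole file");
    }
    return fileData;
}
```

The reinterpret_cast does the same job as the C-style cast, but makes the intent visible and easier to search for.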
When testing for performance, I would include a test case for:
std::vector<BYTE> readFile(const char* filename)
{
// open the file:
std::ifstream file(filename, std::ios::binary);
// Stop operator>> from skipping whitespace bytes (binary mode alone does not)
file.unsetf(std::ios::skipws);
// get its size:
std::streampos fileSize;
file.seekg(0, std::ios::end);
fileSize = file.tellg();
file.seekg(0, std::ios::beg);
// reserve capacity
std::vector<BYTE> vec;
vec.reserve(fileSize);
// read the data:
vec.insert(vec.begin(),
std::istream_iterator<BYTE>(file),
std::istream_iterator<BYTE>());
return vec;
}
My thinking is that the vector constructor in Method 1 touches every element (it value-initializes them), and then read touches each element again.
Method 2 and Method 3 look most promising, but could suffer one or more reallocations as the vector grows. Hence the reason to reserve before reading or inserting.
I would also test with std::copy:
...
std::vector<BYTE> vec;
vec.reserve(fileSize);
std::copy(std::istream_iterator<BYTE>(file),
std::istream_iterator<BYTE>(),
std::back_inserter(vec));
In the end, I think the best solution will avoid operator >> from istream_iterator (and all the overhead and goodness from operator >> trying to interpret binary data). But I don't know what to use that allows you to directly copy the data into the vector.
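One candidate worth adding to the tests is std::istreambuf_iterator combined with reserve. It pulls raw characters straight from the stream buffer, so operator >> and the skipws flag never come into play. This is a sketch only, not a measured winner; the name readFileUnformatted and the convention of returning an empty vector on failure are assumptions for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <vector>

typedef unsigned char BYTE;

// Hypothetical test case: reserve first, then copy unformatted characters.
std::vector<BYTE> readFileUnformatted(const char* filename)
{
    std::ifstream file(filename, std::ios::binary);
    std::vector<BYTE> vec;
    if (!file.is_open())
        return vec;   // assumption: an empty vector signals failure

    // get its size; tellg yields -1 on failure, in which case we skip reserve
    file.seekg(0, std::ios::end);
    const std::streamoff fileSize = file.tellg();
    file.seekg(0, std::ios::beg);
    if (fileSize > 0)
        vec.reserve(static_cast<std::size_t>(fileSize));

    // istreambuf_iterator does no formatting and ignores skipws
    vec.assign(std::istreambuf_iterator<char>(file),
               std::istreambuf_iterator<char>());
    return vec;
}
```

Whether this beats file.read into a pre-sized vector is exactly what the performance test would have to show.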
Finally, my testing with binary data shows that ios::binary alone does not stop operator >> from skipping whitespace bytes. Hence the reason for clearing skipws (the std::noskipws manipulator from <ios> has the same effect).
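The whitespace behaviour can be reproduced without a file. Below is a minimal sketch using an in-memory std::istringstream; the helper name collectBytes and its keepWhitespace parameter are made up for this illustration.

```cpp
#include <ios>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

typedef unsigned char BYTE;

// Hypothetical helper: collects bytes through std::istream_iterator,
// optionally clearing the skipws flag first.
std::vector<BYTE> collectBytes(const std::string& data, bool keepWhitespace)
{
    // the stream is opened in binary mode in both cases
    std::istringstream in(data, std::ios::in | std::ios::binary);
    if (keepWhitespace)
        in >> std::noskipws;   // same effect as in.unsetf(std::ios::skipws)

    return std::vector<BYTE>(std::istream_iterator<BYTE>(in),
                             std::istream_iterator<BYTE>());
}
```

For the five-byte input "a b\nc", collectBytes(data, false) yields only the three bytes a, b and c, while collectBytes(data, true) yields all five.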