I am writing a Huffman encoder. To do so, I need to read through the input (which will ALWAYS be a redirected file) to record the character frequencies, then build the codebook, and then read through the input a second time so I can encode it.
My problem is that I am currently trying to work out how to read the input from cin twice.
I read online that cin.seekg(0), cin.seekg(ios::beg), or cin.seekg(0, ios::beg) should all work fine as long as the input is redirected from a file and not piped. But when I do that, it does not seem to change the position of cin at all.
Here is the code that I am currently using:
#include<iostream>
#include"huffmanNode.h"
using namespace std;
int main(){
    //create array that stores each character and its frequency
    unsigned int frequencies[255];
    //initialize to zero
    for(int i=0; i<255; i++){
        frequencies[i] = 0;
    }
    //get input and increment the frequency of corresponding character
    char c;
    while(!cin.eof()){
        cin.get(c);
        frequencies[c]++;
    }
    //create initial leaf nodes for all characters that have appeared at least once
    for(int i=0; i<255; i++){
        if(frequencies[i] != 0){
            huffmanNode* tempNode = new huffmanNode(i, frequencies[i]);
        }
    }
    // test readout of the frequency list
    for(int i=0; i<255; i++){
        cout << "Character: " << (char)i << " Frequency: " << frequencies[i] << endl;
    }
    //go back to beginning of input
    cin.seekg(ios::beg);
    //read over input again, incrementing frequencies. Should result in double the amount of frequencies
    // **THIS IS WHERE IT LOOPS FOREVER**
    while(!cin.eof()){
        cin.get(c);
        frequencies[c]++;
    }
    //another test readout of the frequency list
    for(int i=0; i<255; i++){
        cout << "Character: " << (char)i << " Double Frequency: " << frequencies[i] << endl;
    }
    return 0;
}
Debugging shows that it gets stuck in the while loop on line 40, and it seems to constantly be getting a newline character. Why would it not exit this loop? I assume that cin.seekg() is not actually resetting the input.
There are several problems with your code. The first is that you use
the results of an input (cin.get( c )) without checking that the
input has succeeded. This is always an error; in your case, it will
probably only result in counting (and later outputting) the last
character twice, but it can result in undefined behavior. You must
check that the input stream is in a good state after each input, before
using the value input. The usual way of doing this is:
while ( cin.get( c ) ) // ...
which puts the input directly in the loop condition.
The second is the statement:
cin.seekg( std::ios::beg );
I'm actually sort of surprised that this even compiled: there are two
overloads of seekg:
std::istream::seekg( std::streampos );
and
std::istream::seekg( std::streamoff, std::ios_base::seekdir );
std::ios::beg has type std::ios_base::seekdir. It's possible for an
implementation to define std::streampos and std::ios_base::seekdir in
a way so that there is an implicit conversion from
std::ios_base::seekdir to std::streampos, but in my opinion, it
shouldn't, since the results will almost certainly not be what you want.
To seek to the beginning of a file:
std::cin.seekg( 0, std::ios_base::beg );
A third problem: errors in the input stream are sticky. Once you've
reached the end of file, that error will remain, and all other
operations will be no-ops, until you have cleared the error:
std::cin.clear();
Call it before seekg, since seekg is itself a no-op while the stream
is in a failed state.
One final comment: the fact that you are using std::cin worries me.
It will probably work (although there is no guarantee that you can seek
on std::cin, even if the input is redirected from a file), but do be
aware that there is no way you can output the results of a Huffman
encoding to std::cout. It will work under Unix, but probably nowhere
else. Huffman encoding requires that the files be open in binary mode,
which is never the case for std::cin and std::cout.
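One way around this, assuming the program can be given a file name instead of relying on redirection, is to open the file yourself in binary mode. The helper readBytes below is a hypothetical illustration, not part of the question's code:

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file as raw bytes, with no newline
// translation applied by the stream.
std::vector<unsigned char> readBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::vector<unsigned char> bytes;
    char c;
    while (in.get(c)) {
        bytes.push_back(static_cast<unsigned char>(c));
    }
    return bytes;  // empty if the file could not be opened or has no content
}
```

The same flag, std::ios::binary, applies to a std::ofstream used for the encoded output.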