fread() a struct in C

user153882 · Nov 6, 2015 · Viewed 15k times

For my assignment, I'm required to use fread/fwrite. I wrote

#include <stdio.h>
#include <string.h>

struct rec{
    int account;
    char name[100];
    double balance;
};

int main()
{
    struct rec rec1;
    int c;

    FILE *fptr;
    fptr = fopen("clients.txt", "r");

    if (fptr == NULL)
        printf("File could not be opened, exiting program.\n");
    else
    {
        printf("%-10s%-13s%s\n", "Account", "Name", "Balance");
        while (!feof(fptr))
        {
            //fscanf(fptr, "%d%s%lf", &rec.account, rec.name, &rec.balance);
            fread(&rec1, sizeof(rec1),1, fptr);
            printf("%d %s %f\n", rec1.account, rec1.name, rec1.balance);
        }
        fclose(fptr);
    }
    return 0;
}

clients.txt file

100 Jones 564.90
200 Rita 54.23
300 Richard -45.00

output

Account   Name         Balance
540028977 Jones 564.90
200 Rita 54.23
300 Richard -45.00╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠ü☻§9x°é -92559631349317831000000000000000000000000000000000000000000000.000000

Press any key to continue . . .

I can do this with fscanf (which I've commented out), but I'm required to use fread/fwrite.

  1. Why does it start with a massive number for Jones's account?
  2. Why is there garbage after? Shouldn't feof stop this?
  3. Are there any drawbacks to using this method, or to the fscanf method?

How can I fix these? Many thanks in advance

Answer

David Munro · Nov 11, 2015

As the comments say, fread reads the bytes in your file without any interpretation. The file clients.txt consists of 50 characters: 16 in the first line plus 14 in the second plus 18 in the third line, plus two newline characters. (Your clients.txt does not contain a newline after the third line, as you will soon see.) The newline character is a single byte \n on UNIX or Mac OS X machines, but (probably) two bytes \r\n on Windows machines, hence either 50 or 52 characters. Here is the sequence of ASCII bytes in hexadecimal:

3130 3020 4a6f 6e65 7320 3536 342e 3930     100 Jones 564.90
0a32 3030 2052 6974 6120 3534 2e32 330a     \n200 Rita 54.23\n
3330 3020 5269 6368 6172 6420 2d34 352e     300 Richard -45.
3030                                        00

Your fread statement copies these bytes without any interpretation directly into your rec1 data structure. That structure begins with int account;, which says to interpret the first four bytes as an int. As one of the comments noted, you are running your program on a little-endian machine (most likely an Intel machine), so the least significant byte is the first and the most significant byte is the fourth. Thus, your fread said to interpret the sequence of four ASCII characters "100 " as the four byte integer 0x20303031, which equals, in decimal, 540028977. The next member of your struct is char name[100];, which means that the next 100 bytes of data in rec1 will be the name. But the fread was told to read sizeof(rec1)=112 bytes (4 byte account, 100 byte name, 8 byte balance). Since your file is only 50 (or 52) characters, fread will have only been able to fill in that many bytes of rec1. The return value of fread, had you not discarded it, would have told you that the read stopped short of the number of bytes you requested. Since you hit EOF, the feof call breaks out of the loop after that first pass, having consumed the entire file in one gulp.

All of your output after the header line was produced by the first and only call to printf inside the loop. The number 540028977 and the following space were produced by the "%d " and the rec1.account argument. The next bit is only partly determinate, and you got lucky: the "%s" specifier and the corresponding rec1.name argument will print the next characters as ASCII until a \0 byte is found. Thus, the output will begin with the 50-4 (or 52-4) remaining characters of your file -- including the two newlines -- and potentially continue forever, because there are no \0 bytes in your file (or in any text file). After the last character of your file has been printed, what you are seeing is whatever garbage happened to be in the automatic variable rec1 when your program started. (That kind of unintentional output is similar to the famous Heartbleed bug in OpenSSL.) You were lucky the garbage included a \0 byte after only a few dozen more characters. Note that printf has no way to know that rec1.name was declared to be only a 100 byte array -- it only got the pointer to the beginning of name -- so it was your responsibility to guarantee that rec1.name contained a terminating \0 byte, and you never did that.

We can tell a little bit more. The number -9.2559631349317831e61 (which is pretty ugly in "%f" format) is the value of rec1.balance. The 8 bytes for that double value on an IEEE 754 machine (like your Intel and all modern computers) are in hex 0xcccccccccccccccc. Sixty-four of the peculiar ╠ symbol appear in the "%s" output corresponding to rec1.name, while only 100-46 = 54 characters remain of the 100, so your "%s" output has run off the end of rec1.name, and includes rec1.balance into the bargain, and we learn that your terminal program interpreted the non-ASCII character 0xcc as ╠. There are many ways to interpret bytes bigger than 127 (0x7f); in latin-1 it would have been Ì, for example. The ╠ graphical character is the representation of the 0xcc (204) byte in the ancient MS-DOS character set, Windows code page 437. Not only are you running on an Intel machine, it is a Windows machine (of course the most likely possibility to begin with).

That answers your first two questions. I'm not sure I understand your third question. The "drawbacks" I hope are obvious.

As for how to fix it, there is no reasonably simple way to read and interpret a text file using fread. To do so, you would need to duplicate much of the code in the libc fscanf function. The only sensible way is to first use fwrite to create a binary file; then fread will work naturally to read it back. So there have to be two programs -- one to write a binary clients.bin file, and a second to read it back. Of course, that does not solve the problem of where the data for that first program should come from in the first place. It could come from reading clients.txt using fscanf. Or it could be included in the source code of the fwrite program, for example by initializing an array of struct rec like this:

struct rec recs[] = {{100, "Jones", 564.90},
                     {200, "Rita", 54.23},
                     {300, "Richard", -45.00}};

Or it could come from reading a MySQL database, or... The one place it is unlikely to originate is in a binary file (easily) readable with fread.