Seek to line number in text file using C

darksky · Nov 15, 2011 · Viewed 16.2k times

I have a text file of binary digits (ASCII '0' and '1' characters) which looks something like:

00010110001001000110011001000111
01011000011100001010100001001000
11110001011010000010010101111010
00000000000000000000000000000000
01011010101000010001010101110000

Each line has 32 characters (so it is of length 33 with \n). I am trying to seek my file pointer to the line that comes right after the 0x0 line (the 4th line in the above example).

What I did was as follows. First, I counted how many lines are in the file (5 in this case). I also recorded the index of the 0x0 line (4 in this case). I multiplied 4 by 33 to get the position of the first character after the 0x0 line. (I add 1 because I think the product actually gives the position of the \n at the end of the 0x0 line.)

After that, I just used fseek. However, it is not working. What is wrong here? Here is my code:

int bytes = 33 * c;
fseek(fp, bytes+1, SEEK_SET);
char test[34];
printf("HERE: '%s'", fgets(test, 34, fp));

Thanks!

Answer

paxdiablo · Nov 15, 2011

No, you don't have to add one at all. The offset of the first character in the file is 0.

The offset of the first character on the second line is 33 (assuming your line ending really is a single newline, not a CR/LF combination).

First character on third line is at offset 66.
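As a rough sketch of that arithmetic (the names LINE_STRIDE and line_offset are made up for illustration, and it assumes every line is exactly 32 characters plus a single \n):

```c
/* Assumption: every line is 32 data characters plus one '\n'. */
#define LINE_STRIDE 33L

/* Byte offset of the first character of a zero-based line index.
   Line 0 starts at 0, line 1 at 33, line 2 at 66, and so on. */
static long line_offset(long line_index)
{
    return line_index * LINE_STRIDE;
}
```

With the 0x0 line being the fourth line (zero-based index 3), the line after it has index 4, so line_offset(4) gives 132 with no extra +1.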

So your code should actually be:

int bytes = 33 * c;
fseek (fp, bytes, SEEK_SET);  // no "+1" here.
char test[34];
printf ("HERE: '%s'", fgets(test, 34, fp));

Here's a transcript showing that in action:

pax$ cat qq.in
00010110001001000110011001000111
01011000011100001010100001001000
11110001011010000010010101111010
00000000000000000000000000000000
11110000111100001111000011110000

pax$ cat qq.c
#include <stdio.h>

int main (void) {
    char test[34];
    int c = 4;
    FILE *fp = fopen ("qq.in", "r");

    int bytes = 33 * c;
    fseek (fp, bytes, SEEK_SET);
    printf("HERE: %s", fgets(test, 34, fp));

    fclose (fp);
    return 0;
}

pax$ gcc -o qq qq.c ; ./qq
HERE: 11110000111100001111000011110000

Try that code in your environment and see what happens. If you don't get the right data then there's a mismatch between your code and your input file of some sort.

You haven't specified what platform you're on so it may be that you actually have \r\n at the end of the lines rather than just \n. You may also be opening it in the wrong mode (though that usually only matters on Windows).
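One way to check the line ending from within the program is to measure the first line, terminator included. This is only a sketch: first_line_stride is a hypothetical helper, and it assumes the file was opened in binary mode ("rb") so that no newline translation hides a \r.

```c
#include <stdio.h>

/* Returns the number of bytes in the first line, terminator included
   (33 for "\n" endings and 34 for "\r\n" endings on 32-character
   lines), or -1 if the seek fails or no newline is found. */
static long first_line_stride(FILE *fp)
{
    long count = 0;
    int ch;

    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    while ((ch = fgetc(fp)) != EOF) {
        count++;
        if (ch == '\n')
            return count;
    }
    return -1;
}
```

If this returns 34 rather than 33 for your file, the multiplier in the seek calculation needs to be 34 as well.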

Doing a dump on the file to validate its contents is a good idea. For example, in a UNIXy system:

pax$ od -xcb qq.in

0000000    3030    3130    3130    3031    3030    3031    3130    3030
          0   0   0   1   0   1   1   0   0   0   1   0   0   1   0   0
        060 060 060 061 060 061 061 060 060 060 061 060 060 061 060 060
0000020    3130    3031    3130    3031    3130    3030    3130    3131
          0   1   1   0   0   1   1   0   0   1   0   0   0   1   1   1
        060 061 061 060 060 061 061 060 060 061 060 060 060 061 061 061
0000040    300a    3031    3131    3030    3030    3131    3031    3030
         \n   0   1   0   1   1   0   0   0   0   1   1   1   0   0   0
        012 060 061 060 061 061 060 060 060 060 061 061 061 060 060 060
:
<< Unnecessary Detail Removed >>
:
0000240    3030    3030    000a
          0   0   0   0  \n
        060 060 060 060 012
0000245

In addition, you may want to print out the values of c and bytes before using them. The fgets function will only return NULL if there's an error or you reach EOF before any data is read.

So, if you're getting NULL as the return value, either you've seeked beyond the end of the file (likely) or you've encountered an error (somewhat less likely but not impossible).
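A minimal sketch of those checks follows. The helper read_line_at and its return codes are hypothetical, not part of the original code. It prints the computed values, checks the result of fseek, and tests the result of fgets before using it, since passing a NULL pointer to printf's %s is undefined behavior.

```c
#include <stdio.h>

/* Seeks to the start of a zero-based line and reads it into buf.
   Returns 0 on success, -1 if fseek fails, -2 if fgets returns NULL
   (EOF before any data was read, or a read error). */
static int read_line_at(FILE *fp, long line_index, long stride,
                        char *buf, int size)
{
    long bytes = line_index * stride;

    /* Print the values before using them, for debugging. */
    fprintf(stderr, "line_index=%ld bytes=%ld\n", line_index, bytes);

    if (fseek(fp, bytes, SEEK_SET) != 0)
        return -1;
    if (fgets(buf, size, fp) == NULL)
        return -2;   /* do not hand a NULL result to printf("%s") */
    return 0;
}
```

Called as read_line_at(fp, 4, 33, test, 34) on the sample file, a return of -2 would point to a seek past the end of the data or a read error, which ferror(fp) can tell apart.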