Is there a way in Matlab to determine the number of lines in a file without looping through each line?

robguinness · Aug 29, 2012

Obviously one could loop through a file using fgetl or similar function and increment a counter, but is there a way to determine the number of lines in a file without doing such a loop?

Answer

Mehrwolf · Aug 29, 2012

I like to use the following code for exactly this task:

fid = fopen('someTextFile.txt', 'rb');
%# Get file size.
fseek(fid, 0, 'eof');
fileSize = ftell(fid);
frewind(fid);
%# Read the whole file.
data = fread(fid, fileSize, 'uint8');
%# Count number of line-feeds (byte value 10) and increase by one.
%# Note: the +1 assumes the last line is not terminated by a line-feed.
numLines = sum(data == 10) + 1;
fclose(fid);

It is pretty fast if you have enough memory to read the whole file at once. It should work for both Windows- and Linux-style line endings, since both terminate a line with a line feed (byte value 10).
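If the file is too large to read in one go, the same byte-counting idea can be applied block by block. The following is only a minimal sketch under stated assumptions, not the code that was benchmarked here: the function name countLinesChunked, the chunkSize parameter, and its 1 MiB default are made up for illustration. Unlike the snippet above, it adds one for the last line only when the file does not end with a line feed.

```matlab
function numLines = countLinesChunked(fileName, chunkSize)
%COUNTLINESCHUNKED Count lines by scanning a file in fixed-size blocks.
%   Hypothetical helper for illustration; the names are not from the answer.
    if nargin < 2
        chunkSize = 2^20;   % assumed default: 1 MiB per read
    end
    fid = fopen(fileName, 'r');
    if fid == -1
        error('countLinesChunked:openFailed', 'Cannot open %s', fileName);
    end
    cleanup = onCleanup(@() fclose(fid));   % close the file even on error

    numLines = 0;
    lastByte = uint8(10);   % so that an empty file yields zero lines
    while true
        % Read raw bytes and keep them as uint8 to save memory.
        data = fread(fid, chunkSize, 'uint8=>uint8');
        if isempty(data)
            break;
        end
        numLines = numLines + sum(data == 10);   % count line feeds
        lastByte = data(end);
    end

    % Count a final line that is not terminated by a line feed.
    if lastByte ~= 10
        numLines = numLines + 1;
    end
end
```

Called as countLinesChunked('someTextFile.txt'), this keeps at most one block in memory at a time, at the cost of more fread calls than the single-read version.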

Edit: I measured the performance of the answers provided so far. Here is the result for determining the number of lines of a text file containing 1 million double values (one value per line). Average of 10 tries.

 Author           Mean time +- standard deviation (s)
------------------------------------------------------
 Rody Oldenhuis      0.3189 +- 0.0314
 Edric (2)           0.3282 +- 0.0248
 Mehrwolf            0.4075 +- 0.0178
 Jonas               1.0813 +- 0.0665
 Edric (1)          26.8825 +- 0.6790

So the fastest approaches are those using Perl and reading the whole file as binary data. I would not be surprised if Perl internally also reads large blocks of the file at once instead of looping through it line by line (just a guess; I do not know anything about Perl).

A simple fgetl() loop is roughly 25 to 85 times slower than the other approaches (26.88 s versus 0.32 to 1.08 s in the table above).
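For reference, a line-by-line count of the kind mentioned in the question might look like the sketch below. It is an assumed baseline for comparison, not necessarily the exact code that was timed, and the function name countLinesFgetl is hypothetical.

```matlab
function numLines = countLinesFgetl(fileName)
%COUNTLINESFGETL Count lines by reading them one at a time with fgetl.
%   Hypothetical baseline for illustration; the name is not from the thread.
    fid = fopen(fileName, 'r');
    if fid == -1
        error('countLinesFgetl:openFailed', 'Cannot open %s', fileName);
    end
    cleanup = onCleanup(@() fclose(fid));   % close the file even on error

    numLines = 0;
    % fgetl returns a char array for each line (possibly empty) and the
    % numeric value -1 once the end of the file is reached.
    while ischar(fgetl(fid))
        numLines = numLines + 1;
    end
end
```

Each fgetl call handles a single line, so a file with a million lines needs a million calls, which is where the slowdown relative to bulk reads comes from.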

Edit 2: Included Edric's second approach, which is much faster than his first and on par with the Perl solution, I'd say.