Read lines by number from a large file

Aleksandr Levchuk · Aug 23, 2011 · Viewed 28.5k times

I have a file with 15 million lines (it will not fit in memory). I also have a small vector of line numbers: the lines that I want to extract.

How can I read out those lines in one pass?

I was hoping for a C function that does it in one pass.

Answer

mbq · Aug 23, 2011

The trick is to use a connection AND open it before calling read.table:

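con <- file("filename")
open(con)

# skip counts from the connection's current position, not from the start of the file
read.table(con, skip = 5, nrows = 1)   # skips lines 1-5, reads line 6
read.table(con, skip = 20, nrows = 1)  # skips lines 7-26, reads line 27
...
close(con)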

You may also try scan; it is faster and gives more control.
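
A minimal sketch of how the same idea might look with scan, assuming the wanted line numbers sit in a numeric vector. The function name read_lines_by_number and the temporary test file are hypothetical and only serve the illustration. Because each skip is relative to where the previous read stopped, the absolute line numbers are first converted into gaps between consecutive targets.

```r
# Hypothetical helper (not part of the original answer): return the lines of
# `path` whose 1-based numbers are given in `wanted`, reading the file once.
read_lines_by_number <- function(path, wanted) {
  targets <- sort(unique(wanted))
  # Each skip is relative to the connection's current position, so turn
  # absolute line numbers into gaps between consecutive targets.
  skips <- diff(c(0, targets)) - 1

  con <- file(path, open = "r")
  on.exit(close(con))

  found <- rep(NA_character_, length(targets))
  for (i in seq_along(targets)) {
    # sep = "\n" makes the whole line a single character field.
    line <- scan(con, what = character(), sep = "\n",
                 skip = skips[i], nlines = 1, quiet = TRUE)
    # With scan's defaults a blank target line, or a request past the end
    # of the file, gives a zero-length result; both are left as NA here.
    # A line consisting only of "NA" is also read as missing.
    if (length(line) == 1L) found[i] <- line
  }
  # Return the lines in the order (and with any duplicates) of `wanted`.
  found[match(wanted, targets)]
}

# Usage on a small temporary file standing in for the large one.
tmp <- tempfile()
writeLines(sprintf("row %d", 1:100), tmp)
picked <- read_lines_by_number(tmp, c(6, 27, 90))
print(picked)
```

For the line numbers 6 and 27 this computes skips of 5 and 20, the same values used in the read.table calls above. Only one line is held in memory per read, so the size of the file should not matter.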