How can I use fread to read gz files in R?

pythOnometrist · Jun 9, 2016 · Viewed 8.4k times

I am on a Windows machine, trying to speed up the read.table step. My files are all .gz.

x=paste("gzip -c ",filename,sep="")
phi_raw = fread(x)

Error in fread(x) : 

I cannot understand the error. It's a bit too cryptic for me.

This is not a duplicate as suggested by zx8754: I am asking specifically in the context of fread. And while fread does not have native support for gzip, this paradigm should work. See http://www.molpopgen.org/coding/datatable.html

Update

Per the suggestion below, wrapping the command in system() yields a longer error message, though I am still stuck.

Error in fread(system(x)) : 

  'input' must be a single character string containing a file name, a command, full path to a file, a URL starting 'http[s]://', 'ftp[s]://' or 'file://', or the input data itself

In addition: Warning message:


running command 'gzip -c D:/x_.gz' had status 1

Update

Running with gunzip as pointed out below:

Error in fread(system(x)) : 

  'input' must be a single character string containing a file name, a command, full path to a file, a URL starting 'http[s]://', 'ftp[s]://' or 'file://', or the input data itself

In addition: Warning message:

running command 'gunzip -c D:/XX_.gz' had status 127

Note the different exit status.

Answer

George Costanza · Jul 17, 2020

data.table now supports reading .gz files directly with the fread function, provided that the R.utils package is installed.

As suggested in this answer to a similar question, you can simply run the following:

library(data.table)
phi_raw <- fread("filename.gz")
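If R.utils might not be installed, one option is a small wrapper that falls back to piping through gzip. The following is only a sketch: `read_gz` is a hypothetical helper name, and it assumes a data.table version recent enough to read .gz files directly and to accept the `cmd` argument. Two details relevant to the errors in the question: `gzip -c` on its own compresses, so decompressing to stdout needs `-dc`, and `system()` returns the exit status by default, so `fread(system(x))` hands fread an integer rather than a command string.

```r
library(data.table)

# Hypothetical helper (not part of data.table): read a gzipped delimited file.
read_gz <- function(path) {
  # Preferred route: fread() reads .gz directly when R.utils is available.
  if (requireNamespace("R.utils", quietly = TRUE)) {
    return(fread(path))
  }
  # Fallback: let an external gzip decompress to stdout.
  # -d decompresses and -c writes to stdout; assumes gzip is on the PATH.
  if (nzchar(Sys.which("gzip"))) {
    return(fread(cmd = paste("gzip -dc", shQuote(path))))
  }
  stop("Install R.utils or put gzip on the PATH to read .gz files.")
}
```

It would be called as `phi_raw <- read_gz("filename.gz")`. On Windows the fallback branch only helps if a gzip executable is actually installed and visible to R.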