Using R to download newest files from ftp-server

Alexander · Mar 6, 2014 · Viewed 27.4k times

I have a number of files named

FileA2014-03-05-10-24-12
FileB2014-03-06-10-25-12

Here the part "2014-03-05-10-24-12" means "Year-Month-Day-Hours-Minutes-Seconds". These files reside on an FTP server. I would like to use R to connect to the FTP server and download whichever file is newest based on that date.

I have started trying to list the content, using RCurl and dirlistonly. Next step will be to try to parse and find the newest file. Not quite there yet...

library(RCurl)
getURL("ftpserver/",verbose=TRUE,dirlistonly = TRUE) 

Answer

Rentrop · Mar 6, 2014

This should work:

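library(RCurl)
url <- "ftp://yourServer/"
userpwd <- "yourUser:yourPass"
# dirlistonly = TRUE returns the bare file names as one newline-separated string
filenames <- getURL(url, userpwd = userpwd,
             ftp.use.epsv = FALSE, dirlistonly = TRUE)
# Split that string into a character vector with one file name per element
filenames <- strsplit(filenames, "\r*\n")[[1]]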

-

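# Rebuild "YYYY-mm-dd-HH-MM-SS" from each name and parse it as a date-time
times <- sapply(strsplit(filenames, "[-.]"), function(x) {
  # The last four characters of the first piece are the year; pieces 2 to 6 follow
  time <- paste(c(substr(x[1], nchar(x[1]) - 3, nchar(x[1])), x[2:6]),
                collapse = "-")
  as.numeric(as.POSIXct(time, format = "%Y-%m-%d-%H-%M-%S", tz = "GMT"))
})
ind <- which.max(times)  # position of the newest file
# Join url and file name with exactly one slash, then fetch the file
dat <- try(getURL(paste0(sub("/*$", "/", url), filenames[ind]), userpwd = userpwd))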

So dat now contains the contents of the newest file.
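
getURL returns the file as a character string, which suits text files. If the goal is to store the file on disk, or the file is binary, one option is RCurl's getBinaryURL together with writeBin. This is only a minimal sketch, assuming the url, userpwd, filenames and ind objects from the code above; the helper names remote_path and save_newest and the destdir argument are made up for illustration.

```r
# Hypothetical helper: join a base URL and a file name with exactly one slash
remote_path <- function(url, filename) {
  paste0(sub("/*$", "/", url), filename)
}

# Hypothetical helper: fetch one remote file as raw bytes and write it to destdir
save_newest <- function(url, filename, userpwd, destdir = ".") {
  bytes <- RCurl::getBinaryURL(remote_path(url, filename),
                               userpwd = userpwd, ftp.use.epsv = FALSE)
  target <- file.path(destdir, filename)
  writeBin(bytes, target)  # write the raw vector unchanged
  invisible(target)        # return the local path without printing it
}

# Example call (needs a reachable FTP server, so it is left commented out):
# save_newest(url, filenames[ind], userpwd)
```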

To make it reproducible, anyone without access to the FTP server can use this in place of the upper part:

filenames<-c("FileA2014-03-05-10-24-12.csv","FileB2014-03-06-10-25-12.csv")
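
With that sample vector, the timestamp parsing can be checked without an FTP server. The following is a minimal sketch rather than part of the answer's method: it pulls the timestamp out with a regular expression instead of strsplit, and the helper name parse_stamp is made up for illustration. It assumes each relevant file name contains a YYYY-MM-DD-HH-MM-SS stamp; names without one are treated as NA and ignored by which.max.

```r
filenames <- c("FileA2014-03-05-10-24-12.csv", "FileB2014-03-06-10-25-12.csv")

# Hypothetical helper: extract the first YYYY-MM-DD-HH-MM-SS stamp from each name
parse_stamp <- function(x) {
  pattern <- "[0-9]{4}(-[0-9]{2}){5}"
  m <- regexpr(pattern, x)
  stamp <- rep(NA_character_, length(x))  # names without a stamp stay NA
  stamp[m > 0] <- regmatches(x, m)
  as.POSIXct(stamp, format = "%Y-%m-%d-%H-%M-%S", tz = "GMT")
}

times <- parse_stamp(filenames)
newest <- filenames[which.max(times)]  # which.max skips NA values
print(newest)
```

For the two sample names this selects "FileB2014-03-06-10-25-12.csv".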