Loop function and add columns to dataframe in R

user2351480 · Jun 12, 2014 · Viewed 17.4k times

I want to loop a function over several dataframes and add the function's output to each dataframe as a new column. I have read many relevant posts on looping and applying functions to dataframes, and they got me close to what I need but not quite there, so I'm hoping people can help me.

I have a number of dataframes that look like this:

dat1 <- as.data.frame(matrix(rnorm(25), ncol = 5))
dat2 <- as.data.frame(matrix(rnorm(25), ncol = 5))
dat3 <- as.data.frame(matrix(rnorm(25), ncol = 5))

I want to calculate the row means of some of these columns and add them as a new column at the end of the dataframe. So I wrote a function to calculate the means from a dataframe:

my_fun <- function(dataframe) {
  rowMeans(dataframe[, c("V1", "V2")], na.rm = TRUE)
}

To apply this function to one dataframe, I do this:

dat1$V6 <- my_fun(dat1)

But I want to be able to loop through all dataframes and add this mean column on to the end of each df.

After reading some helpful posts, I created a list and used sapply:

dfList <- list(dat1, dat2, dat3)  # create list
sapply(dfList, my_fun)            # apply function to list

This gives me the values that I want, but I don't want them as separate output. I'd like them added as a column to each original dataframe, as happens when I apply the function to an individual dataframe. Can anyone tell me how to do this, or point me to a post that describes it? (I have searched high and low, but maybe I'm typing the wrong keywords.) I'm sure it's very straightforward if you know how!

Answer

Dirk · Jun 12, 2014

It's so much easier, and at least 100x faster, if you use data.table:

library(data.table)
set.seed(612)
dat1 <- as.data.table(matrix(rnorm(25), ncol = 5))
dat2 <- as.data.table(matrix(rnorm(25), ncol = 5))
dat3 <- as.data.table(matrix(rnorm(25), ncol = 5))

dtList <- list(dat1, dat2, dat3)

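# := adds V6 by reference, so each table in dtList is modified in place.
# Note: unlike rowMeans(..., na.rm = TRUE), this gives NA when V1 or V2 is NA.
for (dat in dtList) {
  dat[, V6 := (V1 + V2) / 2]
}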

This gives the following output:

> dtList

[[1]]
           V1         V2         V3         V4          V5          V6
1:  0.3903228 -1.1581608  1.0171311  0.3866628  0.02756137 -0.38391897
2: -0.6030124  0.4713771 -2.4204376 -0.2843527  0.53463600 -0.06581764
3: -0.9850333  0.3343518 -1.2329712 -1.1767533  0.56714483 -0.32534080
4: -0.1591335 -0.6729444  0.5062648 -0.3001857 -0.84896068 -0.41603897
5:  1.7127203  0.3149884  1.7633945  1.7824786 -0.90316850  1.01385434

[[2]]
            V1         V2         V3          V4         V5         V6
1: -1.22790810  0.8429506  0.4921844 -0.29686607 -0.9501956 -0.1924788
2:  0.09405923 -1.6970403  0.1280003  1.22284944  0.8667643 -0.8014905
3:  0.55298783 -0.1081849  0.4120268 -0.56411756  1.9135802  0.2224015
4: -0.82621808  0.4753731  0.4755664 -0.05885804  0.9658787 -0.1754225
5:  0.44262554  0.3036363 -1.7404580  0.88870595  1.4826431  0.3731309

[[3]]
            V1          V2         V3        V4          V5          V6
1:  0.82085834  0.07221027  1.8835042 0.2563714  0.27891033  0.44653430
2:  0.00445113  1.89450534  0.3878858 1.8385587 -1.86381524  0.94947824
3:  0.66458950 -1.31023362 -0.9403257 1.2128128  0.74922668 -0.32282206
4: -1.40169143 -1.52925147  0.8232823 0.3391147  0.33463875 -1.46547145
5:  1.10566340 -1.16512217  0.3859652 0.8123110  0.04712086 -0.02972939
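
If you would rather stay with plain data frames and reuse my_fun from the question, the following is a minimal base R sketch rather than a definitive solution. It assumes the same dat1, dat2 and dat3 as in the question. The list names and the final list2env() step are illustrative choices and are not part of the answer above.

```r
# Sketch: add a row-mean column to every data frame in a list (base R only).
set.seed(612)
dat1 <- as.data.frame(matrix(rnorm(25), ncol = 5))
dat2 <- as.data.frame(matrix(rnorm(25), ncol = 5))
dat3 <- as.data.frame(matrix(rnorm(25), ncol = 5))

# Row means of V1 and V2, ignoring missing values (the question's function).
my_fun <- function(dataframe) {
  rowMeans(dataframe[, c("V1", "V2")], na.rm = TRUE)
}

# Naming the list elements keeps track of which result belongs to which input.
dfList <- list(dat1 = dat1, dat2 = dat2, dat3 = dat3)

# lapply() returns a new list; each element is the data frame plus column V6.
dfList <- lapply(dfList, function(df) {
  df$V6 <- my_fun(df)
  df
})

# Optional: copy the modified data frames back over dat1, dat2 and dat3
# in the current environment (the global environment at top level).
list2env(dfList, envir = environment())
```

Because data frames are copied on modification, the function passed to lapply() has to return the modified data frame, and the result has to be assigned back to dfList. sapply(dfList, my_fun) in the question only returns the means, which is why the data frames themselves were left unchanged.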