R apply() function on specific dataframe columns

skmathur · Aug 29, 2013

I want to use the apply function on a dataframe, but only apply the function to the last 5 columns.

B <- by(wifi, wifi$Room, FUN = function(y) apply(y, 2, A))

This applies A to all the columns of y.

B <- by(wifi, wifi$Room, FUN = function(y) apply(y[4:9], 2, A))

This applies A only to columns 4-9 of y, but the result B no longer contains the first 3 columns. I still want those; I just don't want A applied to them.

wifi[,1:3]+B 

also does not do what I expected/wanted.

Answer

leif · Aug 29, 2013

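lapply is probably a better choice than apply here, because apply first coerces your data.frame to a matrix, which forces every column to a single common type. If even one column is character or factor (a room label, say), all the numeric columns are converted to character before your function sees them. Depending on your context, this could have unintended consequences.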
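To make the coercion concrete, here is a minimal sketch. The data frame `df` and its column names are made up for illustration; they are not from the question.

```r
# Hypothetical data frame: one character column and one numeric column.
df <- data.frame(room = c("a", "b"),
                 signal = c(1.5, 2.5),
                 stringsAsFactors = FALSE)

# apply() converts df with as.matrix() first, so each column reaches
# the function as a character vector.
via_apply <- apply(df, 2, class)

# sapply()/lapply() visit each column as it is stored, so the numeric
# column keeps its type.
via_lapply <- sapply(df, class)

print(via_apply)
print(via_lapply)
```

With apply, both columns report the class "character"; with sapply, `signal` is still "numeric".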

The pattern is:

df[cols] <- lapply(df[cols], FUN)

The 'cols' vector can be variable names or indices. I prefer to use names whenever possible (it's robust to column reordering). So in your case this might be:

wifi[4:9] <- lapply(wifi[4:9], A)
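Note that the line above runs A over each whole column, whereas the question's by() call ran it separately for each Room. If the per-room grouping still matters, one possible approach is to combine the same pattern with ave(). This is only a sketch under assumptions: the small `wifi` data frame, its column names, and the body of `A` are hypothetical stand-ins, and it assumes A returns either one value per group or a vector as long as its input.

```r
# Hypothetical stand-in data: a grouping column plus two numeric columns.
wifi <- data.frame(Room = c("r1", "r1", "r2", "r2"),
                   x = c(1, 3, 10, 20),
                   y = c(2, 4, 6, 8))

# Hypothetical A: center a vector on its own mean.
A <- function(v) v - mean(v)

cols <- c("x", "y")

# ave() splits each column by Room, applies A within each group, and
# returns the results in the original row order.
wifi[cols] <- lapply(wifi[cols], function(col) ave(col, wifi$Room, FUN = A))

print(wifi)
```

The Room column is left untouched, and x and y are centered within each room rather than across the whole data frame.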

An example of using column names:

wifi <- data.frame(A=1:4, B=runif(4), C=5:8)
wifi[c("B", "C")] <- lapply(wifi[c("B", "C")], function(x) -1 * x)
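As a quick check that this pattern keeps the columns you did not select, the sketch below repeats the example with a fixed seed and compares the result against a copy taken beforehand. The names `before` and `cols` are introduced here only for illustration.

```r
set.seed(1)  # fix the random draw so the run is reproducible
wifi <- data.frame(A = 1:4, B = runif(4), C = 5:8)
before <- wifi  # copy of the original values for comparison

cols <- c("B", "C")
wifi[cols] <- lapply(wifi[cols], function(x) -1 * x)

# Column A was not selected, so it is unchanged.
a_unchanged <- identical(wifi$A, before$A)

# The selected columns were negated in place.
b_negated <- isTRUE(all.equal(wifi$B, -before$B))
c_negated <- isTRUE(all.equal(wifi$C, -before$C))

# Column names and order are the same as before.
same_names <- identical(names(wifi), names(before))

print(c(a_unchanged, b_negated, c_negated, same_names))
```

Because the assignment targets `wifi[cols]` rather than `wifi` itself, only those columns are replaced and the rest of the data frame stays as it was.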