Short formula call for many variables when building a model

iinception · Apr 25, 2011 · Viewed 48.3k times

I am trying to build a regression model with lm(...). My dataset has a lot of features (>50), and I do not want to write my code as lm(output~feature1+feature2+feature3+...+feature70). Is there a shorthand notation for this?

Answer

Chase · Apr 25, 2011

You can use . as described in the help page for formula. The . stands for "all columns not otherwise in the formula".

lm(output ~ ., data = myData)
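As a rough sketch of how the . expands, here is a small made-up example. The data frame myData and its columns (output, feature1, feature2, id) are assumptions for illustration only, not the asker's data. It also shows that a column can be left out by subtracting it from the formula.

```r
# Hypothetical data: one response (output) and three other columns
set.seed(1)
myData <- data.frame(
  output   = rnorm(20),
  feature1 = rnorm(20),
  feature2 = rnorm(20),
  id       = 1:20
)

# "." expands to every column of myData that is not already in the formula
fit_all <- lm(output ~ ., data = myData)
names(coef(fit_all))

# Subtract a term to keep a column (here the hypothetical id) out of the model
fit_no_id <- lm(output ~ . - id, data = myData)
names(coef(fit_no_id))
```

The first call uses feature1, feature2 and id as predictors; the second drops id and keeps the rest.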

Alternatively, construct the formula manually with paste. This example is from the as.formula() help page:

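# Build the predictor names x1, x2, ..., x25
xnam <- paste("x", 1:25, sep="")
# Join the names with "+" and convert the string to a formula (outer parentheses print it)
(fmla <- as.formula(paste("y ~ ", paste(xnam, collapse= "+"))))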

You can then pass this object to a regression function: lm(fmla, data = myData).
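When the predictors do not follow a pattern like x1 to x25, the same idea works with the data frame's own column names. The following is only a sketch under assumptions: myData and its columns are made up, and the choice to exclude id is arbitrary. It also shows reformulate() from the stats package, which builds the same formula without pasting strings by hand.

```r
# Hypothetical data: one response (output) and three other columns
set.seed(1)
myData <- data.frame(
  output   = rnorm(20),
  feature1 = rnorm(20),
  feature2 = rnorm(20),
  id       = 1:20
)

# Take every column name except the response and anything to be excluded
predictors <- setdiff(names(myData), c("output", "id"))

# Option 1: paste the names together and convert the string to a formula
fmla <- as.formula(paste("output ~", paste(predictors, collapse = " + ")))

# Option 2: let reformulate() assemble the formula from the term labels
fmla2 <- reformulate(predictors, response = "output")

fit <- lm(fmla, data = myData)
names(coef(fit))
```

Both options produce the formula output ~ feature1 + feature2 here, so either object can be passed to lm().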