Regression of dummy variables in R

Trgovec · Aug 12, 2017 · Viewed 9k times

I am new to R and I am trying to perform a regression on my dataset, which includes, e.g., monthly sales data of a company in different countries over multiple years.

In other statistical programs, in order to control for quarterly cyclical movement of sales as well as for the regional (country) differences, I would create dummy variables indicating e.g. quarters and countries where sales are made.

My questions:

1) I saw that in R you can set a variable's type to 'Factor'. In that case, do I still need to create dummy variables indicating countries and months/quarters, or does R already treat factor variables differently and automatically convert them to dummies in the background?

2) If the above is not the case, and I indeed need to recode my values into 0,1 dummies, is there a neat standard way in R to do it?

Thanks a lot for your help and have a nice day!

Trgovec

Answer

Oriol Mirosa · Aug 12, 2017

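Yes. When you put a factor in a model formula, R automatically expands it into dummy variables using treatment coding, with the first level as the reference category. There is nothing else you need to do: if you run your regression, you should see the typical dummy-variable output for those factors, with one coefficient for each non-reference level.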
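
To make this concrete, here is a minimal sketch on made-up data. The data frame sales_df, its columns, and the country codes are hypothetical names chosen for illustration, not taken from the question; the point is only to show which coefficients lm() produces when the predictors are factors.

```r
# Hypothetical toy data: 24 sales observations across three countries
# and four quarters (all names and values are invented for illustration)
set.seed(1)
sales_df <- data.frame(
  country = factor(rep(c("AT", "DE", "SI"), each = 8)),
  quarter = factor(rep(paste0("Q", 1:4), times = 6)),
  sales   = rnorm(24, mean = 100, sd = 10)
)

# No manual dummy coding: lm() expands each factor on its own
fit <- lm(sales ~ country + quarter, data = sales_df)

# One coefficient per non-reference level; "AT" and "Q1" are the
# reference levels because they sort first alphabetically
names(coef(fit))
# "(Intercept)" "countryDE" "countrySI" "quarterQ2" "quarterQ3" "quarterQ4"

# To use a different reference level, relevel the factor before fitting
sales_df$country <- relevel(sales_df$country, ref = "SI")
```

Each coefficient is then read as the difference from the reference level, holding the other factor fixed.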

Note, however, that there are several ways of coding categorical variables, so you might want a different coding scheme, which you can set with the C() function. You can find good details here. Also, if you need more control, there are packages devoted to creating dummy variables, such as the dummies package.
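
As a rough sketch of those options using only base R (the country vector is again a hypothetical example): model.matrix() shows the dummy columns R builds behind the scenes, and C() attaches a different contrast scheme to a factor. Sum coding is picked here purely as one example of an alternative scheme.

```r
# Hypothetical factor, for illustration only
country <- factor(c("AT", "DE", "SI", "DE", "AT"))

# Default treatment coding: the first level (AT) is the reference
dummies <- model.matrix(~ country)
colnames(dummies)
# "(Intercept)" "countryDE" "countrySI"

# One explicit 0/1 column per level, obtained by dropping the intercept
all_levels <- model.matrix(~ country - 1)
colnames(all_levels)
# "countryAT" "countryDE" "countrySI"

# C() returns the factor with a different contrast scheme attached,
# here sum (deviation) coding instead of treatment coding
country_sum <- C(country, contr.sum)
contrasts(country_sum)
```

If you need the explicit 0/1 columns in your data, all_levels can be bound to the original data frame with cbind().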