I want to do linear regression with the lm
function. My dependent variable is a factor called AccountStatus
:
1:0 days in arrears, 2:30-60 days in arrears, 3:60-90 days in arrears and 4:90+ days in arrears. (4)
As independent variable I have several numeric variables: Loan to value
, debt to income
and interest rate
.
Is it possible to do a linear regression with these variables? I looked on the internet and found something about dummy's, but those were all for the independent variable.
This did not work:
fit <- lm(factor(AccountStatus) ~ OriginalLoanToValue, data=mydata)
summary(fit)
Linear regression does not take categorical variables for the dependent part, it has to be continuous. Considering that your AccountStatus variable has only four levels, it is unfeasible to treat it is continuous. Before commencing any statistical analysis, one should be aware of the measurement levels of one's variables.
What you can do is use multinomial logistic regression, see here for instance. Alternatively, you can recode the AccountStatus as dichotomous and use simple logistic regression.
Sorry to disappoint you, but this is just an inherent restriction of multiple regression, nothing to do with R really. If you want to learn more about which statistical technique is appropriate for different combinations of measurement levels of dependent and independent variables, I can wholeheartedly advise this book.