predict() with arbitrary coefficients in R

Stencil · Sep 6, 2014

I've got some coefficients for a logit model set by a non-R user. I'd like to import those coefficients into R and generate some goodness-of-fit estimates on the same dataset (ROC and confusion matrix) to compare against my own model. My first thought was to coerce the coefficients into an existing glm object using something like

summary(fit)$coefficients[,1] <- y

or

summary(fit)$coefficients <- x

where y and x are matrices containing the coefficients I'm trying to use to predict and fit is a previously created dummy glm object fit to the dataset. Of course, this gives me only errors.

Is there any way to pass an arbitrary coefficient vector to the predict() function or to specify coefficients in a model? Can I somehow force this by passing a vector into the offset argument of glm()? Thanks

Edit: As mentioned in the comments, there's not much statistical basis for using the arbitrary coefficients. I have a business partner who believes he/she 'knows' the right coefficients and I'm trying to quantify the loss of predictive power based on those estimates versus the coefficients generated by a proper model.

Edit2: Per BondedDust's answer, I was able to coerce the coefficients, but I wasn't able to clear the error messages that predict() returned after the coercion. It appears that predict.lm, which is called by predict, also looks at the rank of the coefficients, and that is causing the error.

Answer

user20650 · Sep 8, 2014

This is not an answer to your posted question, which BondedDust answered, but it describes an alternative way of calculating the predicted probabilities yourself, which might help in this case.

# Use the mtcars dataset for a minimum worked example
data(mtcars)

# Run a logistic regression and get predictions 
mod <- glm(vs ~ mpg + factor(gear) + factor(am), mtcars, family="binomial")
p1 <- predict(mod, type="response")

# Calculate predicted probabilities manually
m <- model.matrix(~ mpg + factor(gear) + factor(am), mtcars)
p2 <- plogis(drop(m %*% coef(mod)))  # linear predictor -> probability

# Compare with a tolerance; exact == on floating-point values can fail
all.equal(as.numeric(p1), as.numeric(p2))

You can replace coef(mod) with the vector of coefficients given to you. model.matrix will generate the dummy variables required for the calculation; check that its columns are in the same order as the entries of the coefficient vector.
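As a rough sketch of that last step, here is how it could look with a supplied coefficient vector. The values in `given` are made up purely for illustration, the 0.5 cutoff is an assumption, and `auc()` is a small helper written here (not from any package) that uses the rank (Mann-Whitney) formula. Matching by name rather than by position guards against the ordering problem mentioned above.

```r
data(mtcars)
X <- model.matrix(~ mpg + factor(gear) + factor(am), mtcars)

# Hypothetical coefficients from a third party (illustrative values only).
# The names must match the column names of the model matrix.
given <- c("(Intercept)"   = -9.0,
           "factor(am)1"   = -2.0,
           "mpg"           =  0.5,
           "factor(gear)4" =  1.0,
           "factor(gear)5" = -1.0)

# Fail early if a name is missing or misspelled, then reorder by name
stopifnot(setequal(names(given), colnames(X)))
beta <- given[colnames(X)]

# Predicted probabilities implied by the supplied coefficients
p_given <- plogis(drop(X %*% beta))

# Confusion matrix at a 0.5 cutoff (the cutoff is an assumption)
pred_class <- factor(as.integer(p_given > 0.5), levels = 0:1)
conf_mat <- table(predicted = pred_class, actual = mtcars$vs)

# AUC via the rank (Mann-Whitney) formula, so no extra package is needed
auc <- function(p, y) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
auc_given <- auc(p_given, mtcars$vs)
```

Running the same `table()` and `auc()` steps on predict(mod, type="response") from the fitted model should give the numbers to compare against.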