Predict.glm not predicting missing values in response

Question 1

r prediction missing-data glm lm

generic_user · Apr 28, 2013 · Viewed 14.3k times · Source

Answer

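When glm fits the model, it uses only the cases with no missing values (by default, rows containing NA are dropped), so predict called without newdata returns predictions only for the cases used in the fit. You can still get predictions for the cases where y is missing by constructing a data frame and passing it to predict.glm as newdata: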

predict(m, newdata=data.frame(y, x))
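As a minimal sketch of the difference, assuming the same simulated data as in the question (the seed and the names d, p_fit and p_all are only illustrative), the following compares predict with and without newdata:

```r
set.seed(1)                            # arbitrary seed, for reproducibility only
y <- c(round(runif(50)), rep(NA, 50))  # last 50 responses are missing
x <- rnorm(100)
d <- data.frame(y = y, x = x)

m <- glm(y ~ x, family = binomial(link = "logit"), data = d)

nobs(m)                                # 50: only complete cases enter the fit

p_fit <- predict(m)                    # no newdata: fitted cases only
p_all <- predict(m, newdata = d)       # newdata: one prediction per row of d

length(p_fit)                          # 50
length(p_all)                          # 100
anyNA(p_all)                           # FALSE, because x has no missing values
```

Only the predictor columns of newdata are used for prediction, so the NA values in y do not matter here. Rows where x itself is missing would still come back as NA.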

Question 2

For some reason, when I fit glms (and lms too, it turns out), R does not predict values for the cases where the response is missing. Here is an example:

y = round(runif(50))
y = c(y,rep(NA,50))
x = rnorm(100)
m = glm(y~x, family=binomial(link="logit"))
p = predict(m,na.action=na.pass)
length(p)

y = round(runif(50))
y = c(y,rep(NA,50))
x = rnorm(100)
m = lm(y~x)
p = predict(m)
length(p)

The length of p should be 100, but it's 50. The weird thing is that I have other predict calls in the same script that do seem to predict from missing data.

EDIT: It turns out that those other predictions were quite wrong. I was doing imputed.value = rnorm(N,mean.from.predict,var.of.prediction.interval). Because length(predict)<N, rnorm recycled the mean and sd vectors returned by the lm or glm predict functions, which was quite different from what I was after.
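To illustrate the recycling described in the EDIT, here is a small sketch with made-up numbers (mu, draws and the sizes are hypothetical, not taken from the original script). rnorm recycles mean and sd to the requested length without any warning:

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
mu <- c(0, 1000)                      # only 2 means supplied
draws <- rnorm(6, mean = mu, sd = 1)  # 6 draws requested; mu is reused as 0, 1000, 0, 1000, ...

length(draws)                         # 6, even though length(mu) is 2
draws[c(2, 4, 6)] > 900               # all TRUE: every second draw used the mean 1000
```

So when the prediction vector is shorter than N, the imputed values silently reuse the predictions from the fitted cases rather than failing.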

So my question is: what about my example code is stopping glm and lm from predicting the missing values?

Thanks!
