How do I deal with NAs in residuals in a regression in R?

c00kiemonster · Jul 30, 2011

I am having trouble with NA values in the residuals of a cross-sectional lm regression in R.

The issue isn't the NA values themselves but the way R presents them.

For example:

test$residuals
#          1          2          4          5 
#  0.2757677 -0.5772193 -5.3061303  4.5102816 
test$residuals[3]
#        4 
# -5.30613 

In this simple example, an NA value in the data makes one of the residuals go missing. When I extract the residuals, I can clearly see that the third index is missing. So far so good, no complaints here. The problem is that the resulting numeric vector is now one element shorter, so its third element is actually the residual for the fourth observation. How can I make R return the residuals like this instead, i.e., explicitly showing NA rather than skipping an index?

test$residuals
#          1          2          3          4          5 
#  0.2757677 -0.5772193         NA -5.3061303  4.5102816

I need to keep track of all individual residuals so it would make my life much easier if I could extract them this way instead.

Answer

c00kiemonster · Jul 30, 2011

I found this after googling a bit deeper. The way to go is to fit the lm with na.action=na.exclude and then extract the residuals with the resid function, which pads the dropped observations with NA.
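
As a minimal sketch of that approach: the data frame `df` and its values below are made up for illustration (they are not the data behind `test` in the question), with a missing response in row 3. Note that accessing `fit$residuals` directly still gives the shortened vector; only the extractor function does the padding.

```r
# Hypothetical data: the response is missing for the third observation
df <- data.frame(x = c(1, 2, 3, 4, 5),
                 y = c(2.1, 3.9, NA, 8.2, 9.8))

# na.exclude drops the incomplete row for fitting but remembers where it was
fit <- lm(y ~ x, data = df, na.action = na.exclude)

# resid() pads the result back to the original length, NA at position 3
res <- resid(fit)

# Direct access to the component is NOT padded: still one element short
raw <- fit$residuals

length(res)  # 5
length(raw)  # 4
```

With this, `res[3]` is NA and `res[4]` is the residual for the fourth observation, so positions line up with the rows of the original data. The same padding applies to `fitted(fit)`.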