Imputation in R when mice returns the error "system is computationally singular"

user8270077 · Jan 20, 2018

I am trying to impute missing values in a medium-sized data frame (~100,000 rows) where 5 of the 30 columns contain NAs (a large proportion, around 60%).

I tried mice with the following code:

library(mice)    
data_3 = complete(mice(data_2))

During the first iteration I got the following error:

iter imp variable
  1   1  Existing_EMI  Loan_Amount  Loan_Period

Error in solve.default(xtx + diag(pen)): system is computationally singular: reciprocal condition number = 1.08007e-16

Is there another package that is more robust to this kind of situation? How can I deal with this problem?

Answer

phiver · Jan 20, 2018

Your 5 columns might contain a number of unbalanced factors. When these are turned into dummy variables, there is a high probability that one column will be a linear combination of the others. The default imputation methods of mice involve linear regression, so this produces an X matrix whose cross-product cannot be inverted, which triggers your error.
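To check whether this is what is happening, one option is to expand the predictors into a design matrix and compare its rank with its number of columns. The sketch below uses base R only; the toy data frame and its column names (income, region, is_b) are made up for illustration and are not from the question. Note that model.matrix drops rows containing NAs by default, so on real data the check covers the complete rows only.

```r
# Hypothetical toy data: is_b duplicates the dummy column for region == "b"
set.seed(1)
toy <- data.frame(
  income = rnorm(20),
  region = factor(rep(c("a", "b"), each = 10)),
  is_b   = rep(c(0, 1), each = 10)
)

# Expand factors into dummy columns, as a regression-based method would
X <- model.matrix(~ ., data = toy)

# A rank lower than the number of columns signals linear dependence
rank_X <- qr(X)$rank
print(c(columns = ncol(X), rank = rank_X))
```

If the rank is lower than the column count, dropping or merging the redundant columns (or rare factor levels) before imputing is one way to avoid the singular system.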

Change the method to something else, such as cart: mice(data_2, method = "cart"). Also set a seed before or during imputation (for example via the seed argument of mice) so that the results are reproducible.
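A minimal sketch of that call, assuming the mice package is installed. The nhanes data set that ships with mice stands in for data_2 here, and the seed value 123 is arbitrary.

```r
library(mice)

# nhanes is a small example data set bundled with mice; substitute data_2
imp <- mice(nhanes, method = "cart", m = 5, seed = 123, printFlag = FALSE)

# Extract the first of the m completed data sets
data_3 <- complete(imp)

# Should be FALSE if every missing value was imputed
print(anyNA(data_3))
```

Passing seed to mice fixes the random number stream for that call, so rerunning it with the same seed should give the same imputations.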

My advice is to go through the 7 vignettes of mice. There you can find out how to change the imputation method for individual columns instead of for the whole dataset.
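As a rough illustration of per-column methods, the sketch below starts from the default method vector and overrides two entries. It assumes mice 3.0 or later (for make.method) and again uses the bundled nhanes data; choosing bmi and chl as the columns to switch is arbitrary.

```r
library(mice)

# Default method for each column ("" for columns without missing values)
meth <- make.method(nhanes)

# Override only the columns that cause trouble; the rest keep their defaults
meth["bmi"] <- "cart"
meth["chl"] <- "cart"

imp <- mice(nhanes, method = meth, seed = 123, printFlag = FALSE)
print(imp$method)
```

In the question's data, the entries to override would be those of the columns that trigger the error, such as Existing_EMI, Loan_Amount or Loan_Period.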