Imputation in R

Mehrdad Rohani · Oct 29, 2012 · Viewed 19.9k times

I am new to the R programming language. Is there a way to impute the null (NA) values of just one column in a dataset? All of the imputation commands and libraries I have seen impute the null values of the whole dataset.

Answer

mnel · Oct 29, 2012

Here is an example using the impute function from the Hmisc package, applied to a single column:

library(Hmisc)
DF <- data.frame(age = c(10, 20, NA, 40), sex = c('male','female'))

# impute with mean value

DF$imputed_age <- with(DF, impute(age, mean))

# impute with random value
DF$imputed_age2 <- with(DF, impute(age, 'random'))

# impute with the median
with(DF, impute(age, median))
# impute with the minimum
with(DF, impute(age, min))

# impute with the maximum
with(DF, impute(age, max))


# and if you are sufficiently foolish
# impute with number 7 
with(DF, impute(age, 7))

# impute with letter 'a'
with(DF, impute(age, 'a'))

Look at ?impute for details on how the imputation is implemented.
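
If you would rather not load a package, the same single-column idea can be sketched in base R by indexing the column with is.na. This is only a minimal sketch: impute_column is a hypothetical helper name made up for illustration (it is not part of Hmisc or base R), and it assumes the column is numeric and that fun returns a single value.

```r
# Same example data as in the answer; sex is recycled to fill four rows.
DF <- data.frame(age = c(10, 20, NA, 40), sex = c('male', 'female'))

# impute_column is a hypothetical helper, not part of Hmisc or base R.
# It fills the NA entries of one column with fun() applied to the
# observed (non-missing) values and leaves every other column untouched.
impute_column <- function(data, column, fun = mean) {
  values <- data[[column]]
  missing <- is.na(values)
  values[missing] <- fun(values[!missing])
  data[[column]] <- values
  data
}

# Only the age column is changed; DF itself is not modified.
DF_mean <- impute_column(DF, 'age', mean)
DF_median <- impute_column(DF, 'age', median)
```

Because the helper returns a modified copy, you can compare DF_mean$age with DF$age to see which entry was filled in. For a one-off, the body of the helper can also be written inline as DF$age[is.na(DF$age)] <- mean(DF$age, na.rm = TRUE).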