Replace missing values with column mean

Nikita picture Nikita · Sep 14, 2014 · Viewed 143.9k times · Source

I am not sure how to loop over each column to replace the NA values with the column mean. When I am trying to replace for one column using the following, it works well.

Column1[is.na(Column1)] <- round(mean(Column1, na.rm = TRUE))

The code for looping over columns is not working:

for(i in 1:ncol(data)){
    data[i][is.na(data[i])] <- round(mean(data[i], na.rm = TRUE))
}

the values are not replaced. Can someone please help me with this?

Answer

Thomas picture Thomas · Sep 14, 2014

A relatively simple modification of your code should solve the issue:

for(i in 1:ncol(data)){
  data[is.na(data[,i]), i] <- mean(data[,i], na.rm = TRUE)
}