How to convert a numeric column to a factor in R

song0089 · Dec 2, 2013 · Viewed 8.5k times

I'm trying to use the softImpute function (from the softImpute package) to fill in missing values, and I want to convert the categorical variables in a large data frame to factors before calling softImpute.

I've tried both as.factor and factor, but both yield the following error:

train[a]=factor(train[a])

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

Here a is a vector like c(1:92).

I tried as.character too, but softImpute would not recognize the variables as character and treated them as numeric, resulting in decimal values for categorical/indicator variables.

Answer

IRTFM · Dec 2, 2013

Try:

train[[a]]=factor(train[[a]])

This assumes, of course, that a is an object holding either a numerical value in the range 1:length(train) or one of the values in the names(train) vector. If you index a data frame with "[", you get a list with one element. That element is the vector you were hoping to "factorize", but what you get back is not a vector; it is a one-element list. The "[[" function instead gives you just the vector.
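A minimal sketch of the difference, using a small made-up data frame in place of the real train (the column names g1, g2, g3, val and the index vector cols are hypothetical). It also shows one common way to handle several columns at once, since the question's a is a vector like c(1:92) and "[[" with several indices does not select several columns. The lapply form is a suggestion, not part of the original answer.

```r
# Hypothetical toy data standing in for `train`
train <- data.frame(
  g1  = c(1, 2, 2, 3),
  g2  = c(0, 1, 0, 1),
  g3  = c(4, 4, 5, 5),
  val = c(5.5, 6.1, 7.2, 8.0)
)

# "[" keeps the list/data-frame wrapper; "[[" extracts the bare vector
class(train[1])    # "data.frame"
class(train[[1]])  # "numeric"

# Single column: the form given in the answer
a <- 1
train[[a]] <- factor(train[[a]])

# Several columns: apply factor() to each selected column in turn
cols <- 2:3        # hypothetical; the question uses something like 1:92
train[cols] <- lapply(train[cols], factor)

str(train)
```

Here train[cols] is a list of columns, so lapply hands factor() one atomic vector at a time, which avoids the 'x' must be atomic error.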