I have a function in which I define a data.frame that I fill with data using loops. At some point I get this warning message:
Warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = "CHANGE") : invalid factor level, NAs generated
Therefore, when I define my data.frame, I'd like to set the option stringsAsFactors to FALSE, but I don't understand how to do it.
I have tried:
DataFrame = data.frame(stringsAsFactors=FALSE)
and also:
options(stringsAsFactors=FALSE)
What is the correct way to set the stringsAsFactors option?
It depends on how you fill your data frame, for which you haven't given any code. When you construct a new data frame, you can do it like this:
x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)
In this case, if e.g. aVector is a character vector, then the data frame column x$aName will be a character vector as well, and not a factor. Combining that with an existing data frame (using rbind, cbind or similar) should preserve that mode, provided the corresponding column of the existing data frame is not already a factor.
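To make this concrete, here is a minimal sketch. The vectors aVector and bVector and their contents are made up for illustration, since the question does not show how the data frame is filled. Note that from R 4.0.0 onward FALSE is already the default for data.frame, so the explicit argument mainly matters on older versions.

```r
# Hypothetical example data; the values are assumptions for illustration only.
aVector <- c("KEEP", "DROP")
bVector <- c(1, 2)

# Build the data frame without converting strings to factors.
x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)
class(x$aName)

# Appending a row with a previously unseen string works,
# because aName is a character column rather than a factor.
x <- rbind(x, data.frame(aName = "CHANGE", bName = 3, stringsAsFactors = FALSE))

# Assigning by index inside a loop also works without the
# "invalid factor level" warning.
x$aName[1] <- "CHANGE"
```

If the column were a factor instead, the last assignment is the kind of operation that would produce the warning from the question.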
When you execute
options(stringsAsFactors = FALSE)
you change the global default setting. Every data frame you create after executing that line will not auto-convert strings to factors unless explicitly told to do so. If you only need to avoid the conversion in a single place, I'd rather not change the default. However, if this affects many places in your code, changing the default seems like a good idea.
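If you do change the default but want to limit its reach, one possible approach is to set it inside your function and restore the previous value on exit. This is only a sketch: fill_frame and its contents are hypothetical names, not something from the question.

```r
# Hypothetical function; the name and the data are assumptions for illustration.
fill_frame <- function() {
  # options() returns the previous values, so they can be restored later.
  old <- options(stringsAsFactors = FALSE)
  on.exit(options(old), add = TRUE)

  # Any data.frame() call in here now uses the changed default.
  data.frame(label = c("a", "b"), value = c(1, 2))
}

df <- fill_frame()
class(df$label)
```

This keeps the rest of the session unaffected, which avoids surprising other code that relies on the original default.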
One more thing: if your vector is already a factor, then neither of the above will change it back into a character vector. To do so, you should explicitly convert it using as.character or similar.
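As a rough sketch of that conversion, assuming a data frame whose column has already ended up as a factor (the column name status and its values are made up):

```r
# Hypothetical data frame where the string column is a factor.
df <- data.frame(status = c("OK", "FAIL"), stringsAsFactors = TRUE)
is.factor(df$status)

# Convert a single column back to character.
df$status <- as.character(df$status)

# Or convert every factor column in one go, leaving other columns untouched.
df[] <- lapply(df, function(col) if (is.factor(col)) as.character(col) else col)

# New string values can now be assigned without generating NAs.
df$status[2] <- "CHANGE"
```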