I have an xts dataset called dataset that contains closing prices for many stocks. I wanted to check whether their returns are correlated using cor(), but I get an error message: Error in cor(RETS) : 'x' must be numeric.
Here is what I have done:
RETS <- CalculateReturns(dataset, method= c("log")) # Calculate returns Via PerformanceAnalytics
RETS<- na.locf(RETS) #Solves missing NAs by carrying forward last observation
RETS[is.na(RETS)] <- "0" #I then fill the rest of the NAs by adding "0"
Here is a sample of RETS:
row.names A.Close AA.Close AADR.Close AAIT.Close AAL.Close
1 2013-01-01 0 0 0 0 0
2 2013-01-02 0.0035 0.0088 0.0044 -0.00842 0
3 2013-01-03 0.0195 0.0207 -0.002848 -0.00494 0
4 2013-01-06 -0.0072 -0.0174 0.0078 -0.00070 0
5 2013-01-07 -0.0080 0 -0.01106 -0.03353 0
6 2013-01-08 0.0266 -0.002200 0.006655 0.0160 0
7 2013-01-09 0.0073 -0.01218 0.007551 0.013620 0
Then I perform the correlation:
#Perform Correlation
cor(RETS) -> correl
Error in cor(RETS1) : 'x' must be numeric
#Tried using as.numeric
cor(as.numeric(RETS), as.numeric(RETS)) -> correl
However, the result is just 1. I also tried the correlation function in the psych package, but I get the same error message.
I'm adding @Roland's answer here to close out the question.
The problem is that using
RETS[is.na(RETS)] <- "0"
converts all the data to character: assigning a character value into a numeric object coerces every element to character. cor() cannot compute correlations on character values, hence the error. So if you simply do
RETS[is.na(RETS)] <- 0
instead, you should avoid the conversion problem.
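The coercion can be reproduced on a small scale. The following is only a sketch: it uses a made-up plain numeric matrix (the name rets and its values are hypothetical, not the asker's data) as a stand-in for the xts object, on the assumption that the underlying data behaves the same way.

```r
# Hypothetical toy returns with some missing values (not the asker's data).
rets <- matrix(c(0.01, NA, 0.02,
                 -0.01, 0.03, NA),
               ncol = 2,
               dimnames = list(NULL, c("A.Close", "AA.Close")))

# Replacing NAs with the string "0" coerces every element to character.
bad <- rets
bad[is.na(bad)] <- "0"
typeof(bad)       # "character"
is.numeric(bad)   # FALSE, so cor(bad) fails with "'x' must be numeric"

# Replacing NAs with the number 0 keeps the matrix numeric.
good <- rets
good[is.na(good)] <- 0
is.numeric(good)  # TRUE
correl <- cor(good)
```

Checking is.numeric() or str() on the object right after the NA replacement is a quick way to catch this kind of silent type change.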
Rather than setting your missing values to 0, you might also consider explicitly telling cor() how to handle missing values. For example,
cor(RETS, use="pairwise.complete.obs")
computes each correlation using only the observations where both variables are non-NA. See the ?cor help page for all of the options.
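To illustrate the difference, here is a rough sketch on a made-up matrix (the name rets and its values are hypothetical). It compares the default behaviour of cor() with use = "pairwise.complete.obs" when the NAs are left in place.

```r
# Hypothetical toy returns with NAs left in place (not the asker's data).
rets <- matrix(c(0.01, NA,   0.02, -0.01, 0.03,
                 0.02, 0.01, NA,   -0.02, 0.01),
               ncol = 2,
               dimnames = list(NULL, c("A.Close", "AA.Close")))

# Default (use = "everything"): any NA in a pair of columns
# makes their correlation NA.
default_cor <- cor(rets)

# Pairwise: each correlation uses only the rows where both columns are non-NA.
pairwise_cor <- cor(rets, use = "pairwise.complete.obs")

# The same number computed by hand from the complete rows.
ok <- complete.cases(rets)
manual <- cor(rets[ok, "A.Close"], rets[ok, "AA.Close"])
```

Note that with pairwise deletion each entry of the matrix may be based on a different set of dates, which is worth keeping in mind when comparing correlations across many stocks.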