Converting data frame into Time Series using R

Abhishek · Nov 25, 2016

I have time series data in the following format:

               Time Ask Bid Trade Ask_Size Bid_Size Trade_Size
2016-11-01 09:00:12  NA 901    NA       NA      100         NA
2016-11-01 09:00:21  NA  NA   950       NA       NA          5
2016-11-01 09:00:21  NA 950    NA       NA        5         NA
2016-11-01 09:00:21 905  NA    NA       10       NA         NA
2016-11-01 09:00:24  NA 921    NA       NA      500         NA
2016-11-01 09:00:28  NA 879    NA       NA        2         NA

The structure of the data frame is:

 str(df)

'data.frame':   35797 obs. of  7 variables:
 $ Time      : POSIXct, format: "2016-11-01 09:00:12" "2016-11-01 09:00:21" ...
 $ Ask       : num  NA NA NA 905 NA NA 1040 NA NA 905 ...
 $ Bid       : num  901 NA 950 NA 921 879 NA NA 950 NA ...
 $ Trade     : num  NA 950 NA NA NA NA NA 950 NA NA ...
 $ Ask_Size  : num  NA NA NA 10 NA NA 6 NA NA 10 ...
 $ Bid_Size  : num  100 NA 5 NA 500 2 NA NA 5 NA ...
 $ Trade_Size: num  NA 5 NA NA NA NA NA 5 NA NA ...

I am trying to convert it to a time series using this code:

library(zoo)
library(xts)
library(lubridate)

df_ts <- xts(x = df, order.by = df$Time)

but I am getting strange output:

                    Time                  Ask       Bid      Trade Ask_Size Bid_Size Trade_Size
2016-11-01 01:00:03 "2016-11-01 01:00:03" NA        "938.10" NA    NA       " 203"   NA
2016-11-01 01:00:04 "2016-11-01 01:00:04" NA        "937.20" NA    NA       " 100"   NA
2016-11-01 01:00:04 "2016-11-01 01:00:04" " 938.00" NA       NA    "  28"   NA       NA
2016-11-01 01:00:04 "2016-11-01 01:00:04" NA        "938.10" NA    NA       " 203"   NA
2016-11-01 01:00:04 "2016-11-01 01:00:04" " 939.00" NA       NA    "  11"   NA       NA
2016-11-01 01:00:05 "2016-11-01 01:00:05" NA        "938.15" NA    NA       "  19"   NA

The timestamps appear twice: once as the index and once in the "Time" column. The series also starts at 01:00 instead of 09:00, so the rows are not in the same order as in the original data frame (which starts at 9:00 am). Please help.

Answer

knb · Nov 25, 2016

Try this:

df_ts <- as.xts(x = df[, -1], order.by = df$Time)

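This drops the first column (Time) from the data and uses it only as the index. An xts object stores its data as a matrix, so keeping the POSIXct column alongside the numeric ones coerces every value to a character string, which is why the numbers in your output are quoted.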
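As a rough illustration, here is a minimal sketch on a small made-up data frame. The three rows and the "UTC" time zone are assumptions for the example, not the asker's actual data. It shows the character coercion that happens when the Time column is kept, and that dropping it leaves a numeric series sorted by its index.

```r
library(xts)  # also attaches zoo

# Hypothetical sample with a subset of the question's columns;
# the rows are deliberately out of time order.
df <- data.frame(
  Time = as.POSIXct(c("2016-11-01 09:00:21",
                      "2016-11-01 09:00:12",
                      "2016-11-01 09:00:24"), tz = "UTC"),
  Ask      = c(905, NA, NA),
  Bid      = c(NA, 901, 921),
  Ask_Size = c(10, NA, NA),
  Bid_Size = c(NA, 100, 500)
)

# With the POSIXct column included, the matrix form of the data is character.
is.character(as.matrix(df))        # TRUE

# Drop the Time column from the data and use it only as the index.
df_ts <- xts(x = df[, -1], order.by = df$Time)

is.numeric(coredata(df_ts))        # TRUE

# xts sorts observations by the index, so the row order can differ
# from the row order of the data frame.
all(diff(as.numeric(index(df_ts))) >= 0)

# If the clock times look shifted, inspecting the time zone stored on
# the original column may help (a guess; the question does not confirm
# that a time zone mismatch is the cause).
attr(df$Time, "tzone")
```

Because xts always orders rows by time, a series that starts at a different time than the first row of the data frame may simply mean the data frame was not sorted by Time.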