Understanding dates/times (POSIXlt and POSIXct) in R

mariotomo · Nov 26, 2009

I'm reading a table and it contains strings that describe timestamps. I just want to convert from string to a built-in datetime type...

R> Q <- read.table(textConnection('
               tsstring
1 "2009-09-30 10:00:00"
2 "2009-09-30 10:15:00"
3 "2009-09-30 10:35:00"
4 "2009-09-30 10:45:00"
5 "2009-09-30 11:00:00"
'), as.is=TRUE, header=TRUE)
R> ts <- strptime(Q$tsstring, "%Y-%m-%d %H:%M:%S", tz="UTC")

If I try to store the datetime column in the data.frame, I get a curious error:

R> Q$ts <- ts
Error in `$<-.data.frame`(`*tmp*`, "ts", value = list(sec = c(0, 0, 0,  : 
  replacement has 9 rows, data has 5

But if I go through a numeric representation held in the data.frame, it works:

R> EPOCH <- strptime("1970-01-01 00:00:00", "%Y-%m-%d %H:%M:%S", tz="UTC")
R> Q$minutes <- as.numeric(difftime(ts, EPOCH, tz="UTC"), units="mins")
R> Q$ts <- EPOCH + 60*Q$minutes

Any help in understanding the situation?

Answer

rcs · Nov 26, 2009

strptime returns an object of class POSIXlt; to store the values as a data frame column you need POSIXct:

R> class(strptime("2009-09-30 10:00:00", "%Y-%m-%d %H:%M:%S", tz="UTC"))
[1] "POSIXt"  "POSIXlt"
R> class(as.POSIXct("2009-09-30 10:00:00", format="%Y-%m-%d %H:%M:%S", tz="UTC"))
[1] "POSIXt"  "POSIXct"
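
Applied to the data in the question, a minimal sketch of the fix might look like the following. The data frame is rebuilt inline so the snippet is self-contained, and the column name ts2 is only illustrative. Either convert the strptime result with as.POSIXct, or parse straight to POSIXct by passing format= by name.

```r
# Rebuild the question's data frame inline so the example is self-contained.
Q <- data.frame(
  tsstring = c("2009-09-30 10:00:00", "2009-09-30 10:15:00",
               "2009-09-30 10:35:00", "2009-09-30 10:45:00",
               "2009-09-30 11:00:00"),
  stringsAsFactors = FALSE
)

# Parse as in the question (this gives POSIXlt), then convert to POSIXct
# before storing the result as a column.
lt <- strptime(Q$tsstring, "%Y-%m-%d %H:%M:%S", tz = "UTC")
Q$ts <- as.POSIXct(lt)

# One-step alternative: parse the strings directly to POSIXct.
# ts2 is a hypothetical column name used only for comparison.
Q$ts2 <- as.POSIXct(Q$tsstring, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Both columns are POSIXct and hold the same instants.
inherits(Q$ts, "POSIXct")
all(Q$ts == Q$ts2)
```

Both routes should produce a five-row column, one timestamp per input string.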

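Class POSIXct represents the (signed) number of seconds since the beginning of 1970 (in UTC) as a numeric vector, so it fits into a data frame column like any other numeric vector. Class POSIXlt is a named list of vectors representing sec, min, hour, mday, mon, year, etc. In the output below that list has nine components, which is where the "9 rows" in the error message come from. Note that mon is zero-based (8 means September) and year counts from 1900 (109 means 2009).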

R> unclass(strptime("2009-09-30 10:00:00", "%Y-%m-%d %H:%M:%S", tz="UTC"))
$sec
[1] 0
$min
[1] 0
$hour
[1] 10
$mday
[1] 30
$mon
[1] 8
$year
[1] 109
$wday
[1] 3
$yday
[1] 272
$isdst
[1] 0
attr(,"tzone")
[1] "UTC"

R> unclass(as.POSIXct("2009-09-30 10:00:00", format="%Y-%m-%d %H:%M:%S", tz="UTC"))
[1] 1.254e+09
attr(,"tzone")
[1] "UTC"
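
To make the difference between the two representations concrete, here is a small sketch that reads fields from a POSIXlt value, reads the underlying number from a POSIXct value, and converts between them. It also suggests why the numeric workaround in the question worked: adding seconds to a POSIXlt value returns POSIXct. The variable names lt, ct and shifted are illustrative only.

```r
lt <- strptime("2009-09-30 10:00:00", "%Y-%m-%d %H:%M:%S", tz = "UTC")
ct <- as.POSIXct(lt)

# POSIXlt: fields are accessed by name.
# mon is zero-based and year counts from 1900.
lt$hour          # 10
lt$mon + 1       # 9 (September)
lt$year + 1900   # 2009

# POSIXct: a single number of seconds since 1970-01-01 00:00:00 UTC.
as.numeric(ct)   # 1254304800

# Arithmetic on a POSIXlt value yields POSIXct, which is why the
# question's EPOCH + 60 * Q$minutes could be stored in the data frame.
shifted <- lt + 15 * 60
inherits(shifted, "POSIXct")   # TRUE

# Converting back to POSIXlt exposes the broken-down fields again.
as.POSIXlt(shifted)$min        # 15
```

As a rule of thumb, POSIXct suits storage and arithmetic, while POSIXlt is convenient when individual fields such as the hour or weekday are needed.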