With the "lubridate" package in R, I can find out whether two time periods overlap. But is there an efficient way to compute how many days they overlap? (For instance, how many days a woman smoked while pregnant. The pregnancy period and the smoking period may overlap totally, partially, or not at all.)
Here is an example with three women:
preg_start<-as.Date(c("2011-01-01","2012-01-01","2013-01-01"))
preg_end<-preg_start+270 # end after 9 months
smoke_start<-as.Date(c("2011-02-01","2012-08-01","2014-01-01"))
smoke_end<-smoke_start+100 # all three smoked 100 days
data<-data.frame(preg_start,preg_end,smoke_start,smoke_end) # no cbind(): it would coerce the dates to plain numbers
I want to add a variable saying that the first woman smoked 100 days during pregnancy, the second smoked 57 days, and the third did not smoke while pregnant.
Use interval() to create time intervals for pregnancy and smoking. Then calculate the intersect() of these intervals. From that you can calculate the period in days.
library("lubridate")
preg_start<-as.Date(c("2011-01-01","2012-01-01","2013-01-01"))
preg_end<-preg_start+270 # end after 9 months
smoke_start<-as.Date(c("2011-02-01","2012-08-01","2014-01-01"))
smoke_end<-smoke_start+100 # all three smoked 100 days
# interval() replaces new_interval(), which is deprecated in newer lubridate releases
smoke <- interval(smoke_start, smoke_end, tzone="UTC")
preg <- interval(preg_start, preg_end, tzone="UTC")
day(as.period(intersect(smoke, preg), "days"))
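I get 100, 57 and 0 days of smoking during pregnancy. (Depending on your lubridate version, the non-overlapping third case may come back as NA instead of 0.)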
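If you would rather not depend on how intersect() treats non-overlapping intervals, the same numbers can be computed in base R. This is only a sketch of an alternative, not the lubridate approach above: the overlap runs from the later of the two start dates to the earlier of the two end dates, and a negative length means the periods never overlapped. The names overlap_start, overlap_end, overlap_days and the column smoke_days_preg are made up for illustration.

```r
preg_start  <- as.Date(c("2011-01-01", "2012-01-01", "2013-01-01"))
preg_end    <- preg_start + 270   # end after 9 months
smoke_start <- as.Date(c("2011-02-01", "2012-08-01", "2014-01-01"))
smoke_end   <- smoke_start + 100  # all three smoked 100 days

# The overlap starts at the later start date and ends at the earlier end date
overlap_start <- pmax(preg_start, smoke_start)
overlap_end   <- pmin(preg_end, smoke_end)

# Length in days; a negative value means no overlap, so clamp it to 0
overlap_days <- pmax(0, as.numeric(difftime(overlap_end, overlap_start, units = "days")))

# Attach the result as a new variable (hypothetical column name)
data <- data.frame(preg_start, preg_end, smoke_start, smoke_end)
data$smoke_days_preg <- overlap_days
```

Because pmax() and pmin() are vectorised, this works row by row without a loop and should scale well to large data sets.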