I have a problem dealing with time series in R.
#--------------read data
wb = loadWorkbook("Countries_Europe_Prices.xlsx")
df = readWorksheet(wb, sheet="Sheet2")
x <- df$Year
y <- df$Index1
y <- lag(y, 1, na.pad = TRUE)
cbind(x, y)
It gives me the following output:
x y
[1,] 1974 NA
[2,] 1975 50.8
[3,] 1976 51.9
[4,] 1977 54.8
[5,] 1978 58.8
[6,] 1979 64.0
[7,] 1980 68.8
[8,] 1981 73.6
[9,] 1982 74.3
[10,] 1983 74.5
[11,] 1984 72.9
[12,] 1985 72.1
[13,] 1986 72.3
[14,] 1987 71.7
[15,] 1988 72.9
[16,] 1989 75.3
[17,] 1990 81.2
[18,] 1991 84.3
[19,] 1992 87.2
[20,] 1993 90.1
But I want the first value in y to be 50.8 and so forth. In other words, I want to get a negative lag. I don't get it, how can I do it?
My problem is very similar to this problem, but however I cannot solve it. I guess I still do not understand the solution(s)...
How about the built-in 'lead' function? (from the dplyr package) Doesn't it do exactly the job of Ahmed's function?
cbind(x, lead(y, 1))
If you want to be able to calculate either positive or negative lags in the same function, i suggest a 'shorter' version of his 'shift' function:
shift = function(x, lag) {
require(dplyr)
switch(sign(lag)/2+1.5, lead(x, abs(lag)), lag(x, abs(lag)))
}
What it does is creating 2 cases, one with lag the other with lead, and chooses one case depending on the sign of your lag (the +1.5 is a trick to transform a {-1, +1} into a {1, 2} alternative).