subtract value from previous row by group

haitham picture haitham · Jun 2, 2015 · Viewed 46.1k times · Source

In R, let's say I have this data frame:

Data
id      date        value
2380    10/30/12    21.01
2380    10/31/12    22.04
2380    11/1/12     22.65
2380    11/2/12     23.11
20100   10/30/12    35.21
20100   10/31/12    37.07
20100   11/1/12     38.17
20100   11/2/12     38.97
20103   10/30/12    57.98
20103   10/31/12    60.83 

And I want to subtract the previous value from the current value, by group ID date, to create this:

id      date        value   diff
2380    10/30/12    21.01   0
2380    10/31/12    22.04   1.03
2380    11/1/12     22.65   0.61
2380    11/2/12     23.11   0.46
20100   10/30/12    35.21   0
20100   10/31/12    37.07   1.86
20100   11/1/12     38.17   1.1
20100   11/2/12     38.97   0.8
20103   10/30/12    57.98   0
20103   10/31/12    60.83   2.85

Answer

zero323 picture zero323 · Jun 2, 2015

With dplyr:

library(dplyr)

data %>%
    group_by(id) %>%
    arrange(date) %>%
    mutate(diff = value - lag(value, default = first(value)))

For clarity you can arrange by date and grouping column (as per comment by lawyer)

data %>%
    group_by(id) %>%
    arrange(date, .by_group = TRUE) %>%
    mutate(diff = value - lag(value, default = first(value)))

or lag with order_by:

data %>%
    group_by(id) %>%
    mutate(diff = value - lag(value, default = first(value), order_by = date))

With data.table:

library(data.table)

dt <- as.data.table(data)
setkey(dt, id, date)
dt[, diff := value - shift(value, fill = first(value)), by = id]