In a dataset with multiple observations per subject, I want to take a subset containing only the row with the maximum data value for each subject. For example, with the following dataset:
ID <- c(1,1,1,2,2,2,2,3,3)
Value <- c(2,3,5,2,5,8,17,3,5)
Event <- c(1,1,2,1,2,1,2,2,2)
group <- data.frame(Subject=ID, pt=Value, Event=Event)
Subjects 1, 2, and 3 have maximum pt values of 5, 17, and 5, respectively.
How can I find the largest pt value for each subject and put the corresponding observation in another data frame? The resulting data frame should contain only the rows with the largest pt value for each subject.
Here's a data.table solution:
require(data.table) ## 1.9.2
group <- as.data.table(group)
If you want to keep all the entries corresponding to max values of pt within each group:
group[group[, .I[pt == max(pt)], by=Subject]$V1]
#    Subject pt Event
# 1:       1  5     2
# 2:       2 17     2
# 3:       3  5     2
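To see why this works, it can help to run the two steps separately. The following is a minimal sketch, assuming the same data as in the question (rebuilt here directly as a data.table so the block runs on its own; the names idx and result are only illustrative). Inside `[`, .I holds row numbers into the whole table, so the inner call returns, for each Subject, the row numbers where pt equals the group maximum, in a column that data.table names V1 by default.

```r
library(data.table)

# Same data as in the question, built directly as a data.table.
group <- data.table(
  Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
  Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2)
)

# Inner step: row numbers (into the whole table) of each group's maximum pt.
idx <- group[, .I[pt == max(pt)], by = Subject]

# Outer step: subset the original table by those row numbers.
result <- group[idx$V1]
```

Here idx$V1 should contain rows 3, 7, and 9, which are the rows holding the per-subject maxima.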
If you'd like just the first max value of pt:
group[group[, .I[which.max(pt)], by=Subject]$V1]
#    Subject pt Event
# 1:       1  5     2
# 2:       2 17     2
# 3:       3  5     2
In this case, it doesn't make a difference, as there aren't multiple maximum values within any group in your data.
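The difference only shows up when a group has ties. As a rough illustration, here is a sketch on hypothetical data (the ties table below is made up for this example and is not part of the question), in which Subject 1 reaches its maximum pt twice:

```r
library(data.table)

# Hypothetical data: Subject 1 has two rows with the maximum pt of 5.
ties <- data.table(
  Subject = c(1, 1, 2, 2),
  pt      = c(5, 5, 3, 7),
  Event   = c(1, 2, 1, 2)
)

# pt == max(pt) keeps every row that ties for the group maximum.
all_max <- ties[ties[, .I[pt == max(pt)], by = Subject]$V1]

# which.max(pt) keeps only the first row with the group maximum.
first_max <- ties[ties[, .I[which.max(pt)], by = Subject]$V1]
```

With this data, all_max should keep both tied rows for Subject 1 (three rows in total), while first_max keeps one row per subject.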