I've read this question here: Group numeric values by the intervals
However, I would like the output to be numeric (rather than a factor), specifically the numeric value of the lower and/or upper bound of each interval (in separate columns).
The following is essentially right, except that 'df$start' and 'df$end' come out as factors:
df$start <- cut(df$x,
                breaks = c(0,25,75,125,175,225,299),
                labels = c(0,25,75,125,175,225),
                right = TRUE)
df$end <- cut(df$x,
              breaks = c(0,25,75,125,175,225,299),
              labels = c(25,75,125,175,225,299),
              right = TRUE)
Calling 'as.numeric()' on these columns returns the factor's integer level codes (i.e. values 1-6) rather than the original numbers.
Thanks!
Much of the behavior of cut is related to creating the labels that you're not interested in. You're probably better off using findInterval or .bincode.
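For the .bincode route, here is a minimal sketch. The vector x below is made-up illustration data, not the data from the question. As far as I know, .bincode returns the same integer bin index that cut computes internally, just without building labels, and gives NA for values outside the breaks:

```r
breaks <- c(0, 25, 75, 125, 175, 225, 299)
x <- c(10, 25, 100, 299.5)   # illustrative values (assumption, not from the question)

# Integer bin index for right-closed intervals (a, b]; out-of-range values give NA
bin <- .bincode(x, breaks, right = TRUE)

# Look up the numeric lower and upper bounds directly from the breaks vector
data.frame(x = x, start = breaks[bin], end = breaks[bin + 1])
```

Since the bounds are taken straight from breaks, start and end are plain numeric columns, so no factor conversion is needed.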
You would start with the data:
set.seed(17)
df <- data.frame(x=300 * runif(100))
Then set the breaks and find the intervals:
breaks <- c(0,25,75,125,175,225,299)
df$interval <- findInterval(df$x, breaks)
df$start <- breaks[df$interval]
df$end <- breaks[df$interval + 1]
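Two caveats, in case they matter for your data. By default findInterval uses left-closed intervals [a, b), whereas the cut call in the question uses right = TRUE, i.e. (a, b]. Also, any x above the last break (299) gets an interval index equal to length(breaks), so end comes out as NA for it. A minimal sketch, assuming you want to mirror cut's right-closed bins; the sample vector x is made up for illustration:

```r
breaks <- c(0, 25, 75, 125, 175, 225, 299)
x <- c(10, 25, 25.5, 299, 299.5)   # illustrative values (assumption, not from the question)

# left.open = TRUE makes the intervals (a, b], like cut(..., right = TRUE)
interval <- findInterval(x, breaks, left.open = TRUE)

# Index 0 (at or below the first break) and index length(breaks) (above the
# last break) are outside every bin; mark them NA so both lookups return NA
interval[interval == 0 | interval == length(breaks)] <- NA

start <- breaks[interval]
end   <- breaks[interval + 1]

is.numeric(start) && is.numeric(end)
```

Here 25 falls in the first bin (0 to 25) rather than the second, which is what cut with right = TRUE would do, and 299.5 gets NA for both bounds.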