I am doing a basic boxplot where y = age and x = patient groups:
age <- ggplot(data, aes(factor(group2), age)) + ylim(15, 80)
age + geom_boxplot(fill = "grey80", colour = "#3366FF")
I was hoping you could help me out with a few things:
1) Is it possible to show the number of observations per group above each group's boxplot (but NOT on the x axis, where my group labels are) without having to do this in Paint :)? I have tried using:
age + annotate("text", x = "CON", y = 60, label = "25")
where CON is the 1st group and y = 60 is roughly just above the boxplot for this group. However, the command didn't work. I assume this is because it reads x as a continuous rather than a categorical variable.
2) Also, although there are plenty of questions about using the mean rather than the median for boxplots, I still haven't found code that works for me.
3) On the same matter, is there a way to include the group mean in the boxplot? Perhaps using
age + stat_summary(fun.y=mean, colour="red", geom="point")
which, however, only adds a dot where the mean lies. Or again using
age + annotate("text", x = "CON", y = 30, label = "30")
where CON is the 1st group and y = 30 is roughly the group's mean age.
Knowing how flexible and rich ggplot2 syntax is, I was hoping that there is a more elegant way that uses the real stats output rather than annotate.
Any suggestions/links would be much appreciated!
Thanks!!
Is this anything like what you're after? With stat_summary, as requested:
library(ggplot2)

# function for number of observations
give.n <- function(x){
  # experiment with the multiplier to find the perfect position
  return(c(y = median(x)*1.05, label = length(x)))
}
# function for mean labels
mean.n <- function(x){
  # experiment with the multiplier to find the perfect position
  return(c(y = median(x)*0.97, label = round(mean(x),2)))
}
# plot
ggplot(mtcars, aes(factor(cyl), mpg, label=rownames(mtcars))) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") +
  # fun.data supplies both y and label, so no separate fun.y is needed
  stat_summary(fun.data = give.n, geom = "text") +
  stat_summary(fun.data = mean.n, geom = "text", colour = "red")
The black number is the number of observations and the red number is the mean value. joran's answer shows you how to put the numbers at the top of the boxes.
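If you would rather keep everything inside stat_summary, here is a minimal sketch of one way to pin the counts at a fixed height above all the boxes and to mark the mean with a point as well as a label. It is not necessarily what joran's answer does. The helper names n_at_top, mean_label and mean_point are made up for this example, and the 1.05 multiplier and the vjust value are arbitrary starting points to tweak.

```r
library(ggplot2)

# Hypothetical helper: same y for every group, so all counts line up
# above the tallest box (1.05 is an arbitrary offset, adjust to taste)
n_at_top <- function(x, y_pos = max(mtcars$mpg) * 1.05) {
  data.frame(y = y_pos, label = paste0("n = ", length(x)))
}

# Hypothetical helper: text label positioned at the group mean
mean_label <- function(x) {
  data.frame(y = mean(x), label = round(mean(x), 2))
}

# Hypothetical helper: y position only, used to draw the mean as a point
mean_point <- function(x) {
  data.frame(y = mean(x))
}

p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") +
  stat_summary(fun.data = n_at_top, geom = "text") +
  stat_summary(fun.data = mean_point, geom = "point", colour = "red") +
  # vjust = -0.5 nudges the label just above the red point
  stat_summary(fun.data = mean_label, geom = "text", colour = "red",
               vjust = -0.5)

print(p)
```

Because each helper only returns y (and optionally label), the same pattern should carry over to your own data by swapping mtcars, cyl and mpg for data, group2 and age, and passing a y_pos that suits your ylim(15, 80).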