How to plot mean and standard error in Boxplot in R

Amy WP Liu · Sep 23, 2014 · Viewed 18.1k times

I have two categorical factors ('Habitat' and 'Locality') and one continuous variable (T). 'Habitat' has two levels and 'Locality' has eight levels. For each boxplot, I want the whiskers to represent the standard error (SE) instead of the default, and the centre line to show the mean instead of the median. Is there a way to do this while taking both categorical factors into account when plotting? Many thanks in advance.

This is what I have so far, using the default settings of geom_boxplot in ggplot2, which show the median together with the first and third quartiles:

ggplot(data,aes(x=Locality,y=T)) + 
  geom_boxplot(aes(fill=interaction(Habitat,Locality), 
                   group=interaction(factor(Habitat),Locality)),
               outlier.shape=1,outlier.size=3) + 
  theme_bw() + 
  theme(
    panel.grid.major=element_blank(),
    panel.grid.minor=element_blank(),
    axis.line=element_line(colour='black'),
    legend.position='none',
    axis.text.x=element_text(angle=90,hjust=1,size=12)) + 
  scale_y_continuous('T') + 
  xlab('Locality')

Answer

Masato Nakazawa · Sep 23, 2014

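First, write a function that computes the min, mean - 1 SEM, mean, mean + 1 SEM, and max. Then map these five values onto a boxplot using stat_summary. Note that the box then spans mean ± 1 SEM, while the whiskers extend to the min and max.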

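library(gridExtra)  # grid.arrange(), used to show the two plots side by side
library(ggplot2)

# Return the five values geom_boxplot needs, named as it expects them
MinMeanSEMMax <- function(x) {
  m <- mean(x)
  sem <- sd(x) / sqrt(length(x))  # standard error of the mean
  c(ymin = min(x), lower = m - sem, middle = m, upper = m + sem, ymax = max(x))
}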

g1 <- ggplot(mtcars, aes(factor(am), mpg)) + geom_boxplot() +
  ggtitle("Regular Boxplot")

g2 <- ggplot(mtcars, aes(factor(am), mpg)) +
  stat_summary(fun.data=MinMeanSEMMax, geom="boxplot", colour="red") + 
  ggtitle("Boxplot: Min, Mean-1SEM, Mean, Mean+1SEM, Max")


grid.arrange(g1, g2, ncol=2)
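
If you want the whiskers themselves to be the SE (as the question asks) rather than the box, one possible alternative is to drop the box and draw an error bar around the mean. This is only a sketch: it relies on mean_se(), which ships with ggplot2, and on the fun argument of stat_summary, which I am assuming is available (ggplot2 3.3.0 or later; older versions call it fun.y).

```r
library(ggplot2)

p_se <- ggplot(mtcars, aes(factor(am), mpg)) +
  # error bar from mean - 1 SE to mean + 1 SE, computed per group by mean_se()
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2) +
  # the group mean drawn as a point on top of the error bar
  stat_summary(fun = mean, geom = "point", size = 3) +
  ggtitle("Mean +/- 1 SEM")

print(p_se)
```

Here the plot no longer shows the min and max at all, so use it only if the SE is all you need to convey.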

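
To take both factors into account, the same summary function can be used with one factor on the x axis and the other mapped to fill, dodging the boxes so they sit side by side. The sketch below uses simulated data as a stand-in for the asker's data frame (the name sim, the column name Temp in place of T, and the level names are all made up for illustration), and repeats the summary function so that it runs on its own.

```r
library(ggplot2)

# Same summary function as in the answer, repeated so this block is self-contained
MinMeanSEMMax <- function(x) {
  m <- mean(x)
  sem <- sd(x) / sqrt(length(x))
  c(ymin = min(x), lower = m - sem, middle = m, upper = m + sem, ymax = max(x))
}

# Hypothetical data: 2 habitats x 8 localities, 10 observations per combination
set.seed(1)
sim <- expand.grid(rep = 1:10,
                   Habitat = c("H1", "H2"),
                   Locality = paste0("Loc", 1:8))
sim$Temp <- rnorm(nrow(sim), mean = 20, sd = 3)  # stands in for T in the question

# One box per Habitat x Locality combination; fill defines the groups within each x
p <- ggplot(sim, aes(x = Locality, y = Temp, fill = Habitat)) +
  stat_summary(fun.data = MinMeanSEMMax, geom = "boxplot",
               position = position_dodge(width = 0.9)) +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

print(p)
```

Mapping fill to Habitat is what splits each locality into two boxes; the position_dodge width is chosen to match the default box width and may need adjusting for your data.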