I am unsure why the error bars generated by the mean_sdl function (from Hmisc) in ggplot2 are significantly broader than the error bars I generate manually by plotting mean + sd and mean - sd. My code:
library(drc)
library(tidyverse)
test_dataset <-
  structure(
    list(
      X = c(1e-10, 1e-08, 3e-08, 1e-07, 3e-07, 1e-06, 3e-06, 1e-05, 3e-05, 1e-04, 3e-04),
      AY1 = c(0, 11, 125, 190, 258, 322, 354, 348, NA, 412, NA),
      AY2 = c(3, 33, 141, 218, 289, 353, 359, 298, NA, 378, NA),
      AY3 = c(2, 25, 160, 196, 345, 328, 369, 372, NA, 399, NA),
      BY1 = c(3, NA, 11, 52, 80, 171, 289, 272, 359, 352, 389),
      BY2 = c(5, NA, 25, 55, 77, 195, 230, 333, 306, 320, 338),
      BY3 = c(4, NA, 28, 61, 44, 246, 243, 310, 297, 365, NA)
    ),
    class = c("tbl_df", "tbl", "data.frame"),
    row.names = c(NA, -11L),
    .Names = c("X", "AY1", "AY2", "AY3", "BY1", "BY2", "BY3")
  )
test_dataset2 <- test_dataset %>%
  rename(conc = X) %>%
  gather(-conc, key = "measurement", value = "signal") %>%
  separate(col = measurement, into = c("mAb", "rep"), sep = "Y")
plot_with_mean_sdl <- ggplot(test_dataset2, aes(x = conc, y = signal, col = mAb)) +
  scale_x_log10() +
  # Mean of the replicates as points
  stat_summary(fun.data = mean_se, geom = "point", size = 2) +
  # geom_errorbar(data = (test_dataset2 %>% group_by(mAb, conc) %>%
  #   summarise(AVG = mean(signal), SD = sd(signal)) %>%
  #   dplyr::filter(!is.na(AVG)) %>%
  #   mutate(top = AVG + SD, bottom = AVG - SD)), aes(x = conc, y = AVG, ymin = bottom, ymax = top)) +
  # Error bars from mean_sdl with its default arguments
  stat_summary(fun.data = mean_sdl, geom = "errorbar") +
  # Four-parameter logistic fit from drc
  stat_smooth(method = "drm",
              method.args = list(fct = L.4()),
              se = FALSE,
              n = 300)
plot_with_manual_errorbars <- ggplot(test_dataset2, aes(x = conc, y = signal, col = mAb)) +
  scale_x_log10() +
  # Mean of the replicates as points
  stat_summary(fun.data = mean_se, geom = "point", size = 2) +
  # Error bars computed by hand as mean +/- 1 sd
  geom_errorbar(data = (test_dataset2 %>% group_by(mAb, conc) %>%
    summarise(AVG = mean(signal), SD = sd(signal)) %>%
    dplyr::filter(!is.na(AVG)) %>%
    mutate(top = AVG + SD, bottom = AVG - SD)), aes(x = conc, y = AVG, ymin = bottom, ymax = top)) +
  # stat_summary(fun.data = mean_sdl, geom = "errorbar") +
  # Four-parameter logistic fit from drc
  stat_smooth(method = "drm",
              method.args = list(fct = L.4()),
              se = FALSE,
              n = 300)
I thought mean_sdl (which wraps the smean.sdl function from the Hmisc package) was supposed to plot the mean +/- a constant number of standard deviations. What am I not getting right?
Thanks.
From ?smean.sd (also linked on ?hmisc):
smean.sdl computes the mean plus or minus a constant times the standard deviation.
And:
smean.sdl(x, mult=2, na.rm=TRUE)
So the default appears to be 2 standard deviations.
Use fun.args = list(mult = 1) in the stat_summary call, as shown in the examples for stat_summary.
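To make the effect of the multiplier concrete, here is a minimal base-R sketch. The helper mean_pm_sd is a hypothetical stand-in written for illustration, not the Hmisc implementation; it only mirrors the documented behaviour (mean plus or minus mult times the standard deviation, with NAs dropped). The input values are the three replicates for mAb A at conc = 3e-08 from the question.

```r
# Hypothetical stand-in for the documented behaviour of smean.sdl:
# mean +/- mult * sd, dropping NAs. Not the Hmisc source code.
mean_pm_sd <- function(x, mult = 2) {
  x <- x[!is.na(x)]
  m <- mean(x)
  s <- sd(x)
  c(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

# Replicates AY1, AY2, AY3 at conc = 3e-08 (taken from the question's data)
signal <- c(125, 141, 160)

wide   <- mean_pm_sd(signal)            # default multiplier of 2
narrow <- mean_pm_sd(signal, mult = 1)  # same as manual mean +/- sd

# Total height of each error bar
width_wide   <- unname(wide["ymax"] - wide["ymin"])
width_narrow <- unname(narrow["ymax"] - narrow["ymin"])

print(rbind(wide, narrow))
print(width_wide / width_narrow)
```

With the default multiplier the bar spans four standard deviations in total, against two for the manual version, which would explain why the mean_sdl bars look about twice as tall. In the plot itself, the corresponding change is stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1), geom = "errorbar").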