I have an experiment that is unbalanced where at three sites (L, M, H) we measure a parameter (met
) in four different vegetation types (a, b, c, d). All vegetation types are present at all three sites. Vegetation types are replicated 4 times at L and M and 8 times at H.
Therefore a simple anova and TukeyHSD will not work. Packages Agricolae (HSD.test
) and DTK (DTK.test
) are only working for one way designs, and then there is multcomp... Does the Tukey test in the mcp
function calculate Tukey-Kramer contrasts, or does it give the regular Tukey contrasts? I presume the first to be the case because the package is geared towards testing multiple comparisons for unbalanced designs, but I am unsure because p-values produced with both approaches are virtually the same. What test would then be appropriate?
Also, are there more suitable approaches towards doing such a two way anova for unbalanced data sets?
library(multcomp)
(met <- c(rnorm(16,6,2),rnorm(16,5,2),rnorm(32,4,2)))
(site <- c(rep("L", 16), rep("M", 16), rep("H", 32)))
(vtype <- c(rep(letters[1:4], 16), rep(letters[1:4], 16), rep(letters[1:4], 32)))
dat <- data.frame(site, vtype, met)
# using aov and TukeyHSD
aov.000 <- aov(met ~ site * vtype, data=dat)
summary(aov.000)
TukeyHSD(aov.000)
# using Anova, and multcomp
lm.000 <- lm(met ~ site * vtype, data=dat)
summary(lm.000)
library(car)
Anova.000 <- Anova(lm.000, data=dat)
dat$int <- with(dat, interaction(site, vtype, sep = "x"))
lm.000 <- lm(met ~ int, data = dat)
summary(lm.000)
summary(glht.000 <- glht(lm.000, linfct = mcp(int = "Tukey")))
For unbalanced data, anova with type III SS may be used instead of type I SS [1]. Calculation of type III anova in R [2]:
model <- (met ~ site * vtype)
defopt <- options()
options(contrasts=c("contr.sum", "contr.poly"))
print(drop1(aov(model),~.,test="F"))
options <- defopt
For unbalanced data, pairwise comparisons of adjusted means may be used. Calculation in R [4]:
library(lsmeans)
print(lsmeans(model, list(pairwise ~ site)), adjust = c("tukey"))
print(lsmeans(model, list(pairwise ~ vtype)), adjust = c("tukey"))
print(lsmeans(model, list(pairwise ~ site | vtype)), adjust = c("tukey"))
print(lsmeans(model, list(pairwise ~ vtype | site)), adjust = c("tukey"))
Lines 2 and 3 compare levels of main effects "site" and "vytpe". Lines 4 and 5 compare levels of one factor at each level of another factor separately.
I hope this helps.
References
[1] Miliken and Johnsen. 2009. Analysis of messy data. Volume 1.
[2] http://www.statmethods.net/stats/anova.html
[3] http://cran.r-project.org/web/packages/lsmeans/vignettes/using-lsmeans.pdf