## 14 Jun 2011

### Multiple Comparisons for GLMMs using glmer() & glht()

...here's an example of how to apply multiple comparisons to a generalised linear mixed model (GLMM) using the function glmer from package lme4 & glht() from package multcomp. Also, I present a nice example for visualizing data from a nested sampling design with lattice-plots!

The comparisons: I have a 4-level (stage A, B, C, D) and a 2-level (center vs. periphery) fixed factor and I want to know whether there are differences between center vs. periphery within each stage-level, and whether this effect differs across stage-levels (interactions). For the case being I simulated an effect only for center vs. periphery and this effect was simulated to be the same across all stages, so there should not be any sign. interactions.

This method was used in Cichini et al. (2011).

```library(lme4)
library(mvtnorm)
library(multcomp)
library(lattice)

########################################################################
# setting up fake data, for details on sampling design see:            #
# cichini et al. (2011)                                                #
########################################################################

# 3 successional stages with 44 units and 1 succ. stage wit 10 units:
N = 44*3+10

cen <- per <- cbind(n = rep(NA, N), X = rep(NA, N))

# i simulate an effect between factor-levels cen and per
# all other fixed and random factors and interactions will have
# no effect.
# X = incidents, n = no. of observations:
for(i in 1:N){cen[i,"X"]<-sum(rbinom(5, 1, 0.35)); cen[i, "n"] <- 5}
for(i in 1:N){per[i,"X"]<-sum(rbinom(8, 1, 0.65)); per[i, "n"] <- 8}

dat <- data.frame(rbind(cen, per))

# probabilities:
dat\$p <- dat\$X/dat\$n
stage <- as.factor(rep(c(rep("A",44), rep("B",44), rep("C",44), rep("D",10)),2))
plotA <- paste("A",rep(1:11,c(4,4,4,4,4,4,4,4,4,4,4)),sep="")
plotB <- paste("B",rep(1:13,c(3,4,2,4,4,4,4,4,3,4,4,2,2)),sep="")
plotC <- paste("C",rep(1:12,c(3,4,4,4,4,4,4,4,3,4,4,2)),sep="")
plotD <- paste("D",rep(1:3,c(2,4,4)),sep="")
plot <- as.factor(rep(c(plotA, plotB, plotC, plotD),2))
subplot <-  as.factor(rep(1:142,2))
pos <- as.factor(c(rep("cen", N),rep("per", N)))

df <- data.frame(cbind(dat, pos = pos, stage = stage, plot = plot, subplot = subplot))
str(df)

### visualize data:
stripplot(jitter(p,10) ~ pos | plot, data = df, groups = subplot,
strip = strip.custom(strip.names = c(F,T)),
ylab="p",
layout=c(12,4),
type = c("p", "a"),col=1,cex=0.5,
scales = list(y = list(cex=0.5),
x = list(rot=90,cex=0.6,labels=c("Centers","Periphery"))),
par.strip.text=list(cex=0.7),
page = function(...) {
ltext(x = 0.39,
y = 0.835,
lab = c("CENTERS vs. PERIPHERY\npanels = plots;\nrows = successional stages:\nA: pioneer, B: early, C: late, D: alpine gr.land; \nlines = paired samples (= subplots) "),
})

gmod <- glmer(cbind(dat\$X, dat\$n - dat\$X) ~ stage * pos + (1|plot/subplot), family = binomial)

# comparisons -
# i want to know whether there are differences between center vs. periphery
# within each stage, and whether this effect differs across
# stages (interactions).
# as i simulated an effect only for centre vs. periphery and this
# effect was simulated
# to be the same across all stages, there should not be any sign.
# interactions:

c1 <- rbind("A: Center vs. Peri. Effect" = c(0,0,0,0,1,0,0,0),
"B: Center vs. Peri. Effect" = c(0,0,0,0,1,1,0,0),
"C: Center vs. Peri. Effect" = c(0,0,0,0,1,0,1,0),
"D: Center vs. Peri. Effect" = c(0,0,0,0,1,0,0,1),
"Center vs. Peri. Effect, A vs. B" = c(0,0,0,0,0,1,0,0),
"Center vs. Peri. Effect, A vs. C" = c(0,0,0,0,0,0,1,0),
"Center vs. Peri. Effect, A vs. D" = c(0,0,0,0,0,0,0,1),
"Center vs. Peri. Effect, B vs. C" = c(0,0,0,0,0,-1,1,0),
"Center vs. Peri. Effect, B vs. D" = c(0,0,0,0,0,-1,0,1),
"Center vs. Peri. Effect, C vs. D" = c(0,0,0,0,0,0,-1,1))

summary(glht(gmod, c1))
###########################################################################

# EDIT:
# for investigation of contrasts you can use this functions, i.e.:
stage <- relevel(stage, "B")
model.matrix(~ stage * pos, contrasts = list(stage = "contr.treatment", pos = "contr.treatment"))
```
To cite package ‘lme4’ in publications use:
Douglas Bates, Martin Maechler and Ben Bolker (2011). lme4: Linear mixed-effects models using S4 classes. R package version   0.999375-39. http://CRAN.R-project.org/package=lme4
Please cite the multcomp package by the following reference:
Torsten Hothorn, Frank Bretz and Peter Westfall (2008). Simultaneous   Inference in General Parametric Models. Biometrical Journal 50(3),   346--363.
To cite the lattice package in publications use:
Sarkar, Deepayan (2008) Lattice: Multivariate Data Visualization with R. Springer, New York. ISBN 978-0-387-75968-5

1. Great walkthrough, though I got an error with your glht call ("could not find function "test"). I adjusted the glht call to:

summary(glht(modelEPT, linfct=c1))

adding linfct=c1. I'm not sure this is correct, but the results are very reasonable.

How do you do pairwise comparisons just between the "stages" (i.e. A vs B, A vs. C, A vs D, B vs C, B vs D, C vs D)? My attempt was:

"C vs. D" = c(1,0,0,0,0,0,0,0),
"C vs. F" = c(0,1,0,0,0,0,0,0),
"C vs. G" = c(0,0,1,0,0,0,0,0),
"D vs. F" = c(-1,1,0,0,0,0,0,0),
"D vs. G" = c(-1,0,1,0,0,0,0,1),
"F vs. G" = c(0,-1,1,0,0,0,0,0),

But I am not getting reasonable results using this matrix. I also certainly do not understand how the c1 matrix created with rbind is interpreted by glht, and so dont understand how to adapt the code to do what I want it to (as would others, i presume, trying to adapt code to their different designs).

1. hi,

i can not reproduce your error - everything works fine for me with the above code..

> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
 LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252
 LC_MONETARY=German_Austria.1252 LC_NUMERIC=C
 LC_TIME=German_Austria.1252

attached base packages:
 splines stats graphics grDevices utils datasets methods
 base

other attached packages:
 multcomp_1.2-5 survival_2.36-5 mvtnorm_0.9-96 lme4_0.999375-39
 Matrix_0.999375-50 lattice_0.19-23

loaded via a namespace (and not attached):
 grid_2.13.0 nlme_3.1-100 stats4_2.13.0

here you may utilize built-in standard contrasts, see:
help(glht); help(mcp)

the according code would be:
summary(glht(gmod, linfct = mcp(stage = "Tukey")))

for logic & definition of contrasts please see
standard literature.

hth,
kay

2. What is the "standard literature" for logic and contrast definitions?

1. 3. Really helpful example, but I can´t find the recommended paper in order to check what to write in a paper later.
Best
Jay

1. Best,
Kay

4. Thank you very much for this example it worked well for me!
Do you have by any chance tried multiple comparisons for a triple interaction, i.e. time*treatment*group?
I tried to find a hint on this in the multcomp documentation and respective newsgroups but without success....

1. Sry - maybe you should try and experiment with re-levelling your factors and playing with the contrast matrix (admittedly that's what I did, finding the solution by trial and error). BTW I'd discourage you to use 3-way interactions because it gets overly complicated to interpret the results..

5. Ok, I somehow expected this answer.....
But thank you anyway.

6. Guys I have a question:

In the example, how should be the code for "Center Effect: A vs. B"?

1. Sorry for the late answer: It is included in the contrast matrix (line 5)!

2. Kay,

I meant the code for the relation between the plots A vs. B within just the Central Effect.

Because, in the line 5, I understand that: it is the diference of the A vs. B plots between Central vs. Peripheral effects? Did I misunderstood?

Thanks so much!

3. Ok, this you can get just from the main model with 'summary(gmod)', where the intercept, which is the factor-level combination stage=A and position=center, is compared to stage=B, position=center in the 2nd line of the model summary! See the edit of the original post for how to get the contrasts of this model!

7. Isn't glht() only for parametric models, which glmer is not?
The documentation is confusing, it says "General linear hypotheses and multiple comparisons for parametric models, including generalized linear models, linear mixed effects models [...]" but as far as I know, glms are not for parametric data. Also, lmer() is mentioned but not glmer(). Any thoughts, anyone?

8. It's the models, not the data that is meant here! Of course all kind of GLMs are parametric models..