14 Jun 2011

Multiple Comparisons for GLMMs using glmer() & glht()

...here's an example of how to apply multiple comparisons to a generalised linear mixed model (GLMM) using the function glmer from package lme4 & glht() from package multcomp. Also, I present a nice example for visualizing data from a nested sampling design with lattice-plots! 

The comparisons: I have a 4-level (stage A, B, C, D) and a 2-level (center vs. periphery) fixed factor and I want to know whether there are differences between center vs. periphery within each stage-level, and whether this effect differs across stage-levels (interactions). For the case being I simulated an effect only for center vs. periphery and this effect was simulated to be the same across all stages, so there should not be any sign. interactions.

This method was used in Cichini et al. (2011).

library(lme4)
library(mvtnorm)
library(multcomp)
library(lattice)

########################################################################
# setting up fake data, for details on sampling design see:            #
# cichini et al. (2011)                                                #
# http://www.springerlink.com/content/t4u305045t214882                 #
########################################################################

# 3 successional stages with 44 units and 1 succ. stage wit 10 units:
N = 44*3+10

cen <- per <- cbind(n = rep(NA, N), X = rep(NA, N))

# i simulate an effect between factor-levels cen and per
# all other fixed and random factors and interactions will have
# no effect.
# X = incidents, n = no. of observations:
for(i in 1:N){cen[i,"X"]<-sum(rbinom(5, 1, 0.35)); cen[i, "n"] <- 5}
for(i in 1:N){per[i,"X"]<-sum(rbinom(8, 1, 0.65)); per[i, "n"] <- 8}

dat <- data.frame(rbind(cen, per)) 

# probabilities:
dat$p <- dat$X/dat$n
stage <- as.factor(rep(c(rep("A",44), rep("B",44), rep("C",44), rep("D",10)),2))
plotA <- paste("A",rep(1:11,c(4,4,4,4,4,4,4,4,4,4,4)),sep="")
plotB <- paste("B",rep(1:13,c(3,4,2,4,4,4,4,4,3,4,4,2,2)),sep="")
plotC <- paste("C",rep(1:12,c(3,4,4,4,4,4,4,4,3,4,4,2)),sep="")
plotD <- paste("D",rep(1:3,c(2,4,4)),sep="")
plot <- as.factor(rep(c(plotA, plotB, plotC, plotD),2))
subplot <-  as.factor(rep(1:142,2))
pos <- as.factor(c(rep("cen", N),rep("per", N)))

df <- data.frame(cbind(dat, pos = pos, stage = stage, plot = plot, subplot = subplot))
str(df)

### visualize data:
stripplot(jitter(p,10) ~ pos | plot, data = df, groups = subplot,
strip = strip.custom(strip.names = c(F,T)),
ylab="p",
layout=c(12,4),
type = c("p", "a"),col=1,cex=0.5,
scales = list(y = list(cex=0.5),
              x = list(rot=90,cex=0.6,labels=c("Centers","Periphery"))),
              par.strip.text=list(cex=0.7),
     page = function(...) {
          ltext(x = 0.39,
                y = 0.835,
                lab = c("CENTERS vs. PERIPHERY\npanels = plots;\nrows = successional stages:\nA: pioneer, B: early, C: late, D: alpine gr.land; \nlines = paired samples (= subplots) "),
                cex = 1,adj=0)
      })

gmod <- glmer(cbind(dat$X, dat$n - dat$X) ~ stage * pos + (1|plot/subplot), family = binomial)

# comparisons -
# i want to know whether there are differences between center vs. periphery
# within each stage, and whether this effect differs across
# stages (interactions).
# as i simulated an effect only for centre vs. periphery and this
# effect was simulated
# to be the same across all stages, there should not be any sign.
# interactions:

c1 <- rbind("A: Center vs. Peri. Effect" = c(0,0,0,0,1,0,0,0),
            "B: Center vs. Peri. Effect" = c(0,0,0,0,1,1,0,0),
            "C: Center vs. Peri. Effect" = c(0,0,0,0,1,0,1,0),
            "D: Center vs. Peri. Effect" = c(0,0,0,0,1,0,0,1),
            "Center vs. Peri. Effect, A vs. B" = c(0,0,0,0,0,1,0,0),
            "Center vs. Peri. Effect, A vs. C" = c(0,0,0,0,0,0,1,0),
            "Center vs. Peri. Effect, A vs. D" = c(0,0,0,0,0,0,0,1),
            "Center vs. Peri. Effect, B vs. C" = c(0,0,0,0,0,-1,1,0),
            "Center vs. Peri. Effect, B vs. D" = c(0,0,0,0,0,-1,0,1),
            "Center vs. Peri. Effect, C vs. D" = c(0,0,0,0,0,0,-1,1))

summary(glht(gmod, c1))
###########################################################################

# EDIT:
# for investigation of contrasts you can use this functions, i.e.:
stage <- relevel(stage, "B")
model.matrix(~ stage * pos, contrasts = list(stage = "contr.treatment", pos = "contr.treatment"))

15 comments :

  1. Great walkthrough, though I got an error with your glht call ("could not find function "test"). I adjusted the glht call to:

    summary(glht(modelEPT, linfct=c1))

    adding linfct=c1. I'm not sure this is correct, but the results are very reasonable.

    How do you do pairwise comparisons just between the "stages" (i.e. A vs B, A vs. C, A vs D, B vs C, B vs D, C vs D)? My attempt was:

    "C vs. D" = c(1,0,0,0,0,0,0,0),
    "C vs. F" = c(0,1,0,0,0,0,0,0),
    "C vs. G" = c(0,0,1,0,0,0,0,0),
    "D vs. F" = c(-1,1,0,0,0,0,0,0),
    "D vs. G" = c(-1,0,1,0,0,0,0,1),
    "F vs. G" = c(0,-1,1,0,0,0,0,0),

    But I am not getting reasonable results using this matrix. I also certainly do not understand how the c1 matrix created with rbind is interpreted by glht, and so dont understand how to adapt the code to do what I want it to (as would others, i presume, trying to adapt code to their different designs).

    ReplyDelete
    Replies
    1. hi,

      i can not reproduce your error - everything works fine for me with the above code..

      > sessionInfo()
      R version 2.13.0 (2011-04-13)
      Platform: i386-pc-mingw32/i386 (32-bit)

      locale:
      [1] LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252
      [3] LC_MONETARY=German_Austria.1252 LC_NUMERIC=C
      [5] LC_TIME=German_Austria.1252

      attached base packages:
      [1] splines stats graphics grDevices utils datasets methods
      [8] base

      other attached packages:
      [1] multcomp_1.2-5 survival_2.36-5 mvtnorm_0.9-96 lme4_0.999375-39
      [5] Matrix_0.999375-50 lattice_0.19-23

      loaded via a namespace (and not attached):
      [1] grid_2.13.0 nlme_3.1-100 stats4_2.13.0


      commenting your comparisons of stage-levels:

      here you may utilize built-in standard contrasts, see:
      help(glht); help(mcp)

      the according code would be:
      summary(glht(gmod, linfct = mcp(stage = "Tukey")))

      for logic & definition of contrasts please see
      standard literature.

      hth,
      kay

      Delete
  2. What is the "standard literature" for logic and contrast definitions?

    ReplyDelete
    Replies
    1. see this, i.e.: http://books.google.com/books?uid=102215719968838886223&as_coll=3&hl=en&output=html_text&source=gbs_lp_bookshelf_list

      Delete
  3. Really helpful example, but I can´t find the recommended paper in order to check what to write in a paper later.
    Best
    Jay

    ReplyDelete
    Replies
    1. I added the link to the post!
      Best,
      Kay

      Delete
  4. Thank you very much for this example it worked well for me!
    Do you have by any chance tried multiple comparisons for a triple interaction, i.e. time*treatment*group?
    I tried to find a hint on this in the multcomp documentation and respective newsgroups but without success....

    ReplyDelete
    Replies
    1. Sry - maybe you should try and experiment with re-levelling your factors and playing with the contrast matrix (admittedly that's what I did, finding the solution by trial and error). BTW I'd discourage you to use 3-way interactions because it gets overly complicated to interpret the results..

      Delete
  5. Ok, I somehow expected this answer.....
    But thank you anyway.

    ReplyDelete
  6. Guys I have a question:

    In the example, how should be the code for "Center Effect: A vs. B"?

    ReplyDelete
    Replies
    1. Sorry for the late answer: It is included in the contrast matrix (line 5)!

      Delete
    2. Kay,

      I meant the code for the relation between the plots A vs. B within just the Central Effect.

      Because, in the line 5, I understand that: it is the diference of the A vs. B plots between Central vs. Peripheral effects? Did I misunderstood?

      Thanks so much!

      Delete
    3. Ok, this you can get just from the main model with 'summary(gmod)', where the intercept, which is the factor-level combination stage=A and position=center, is compared to stage=B, position=center in the 2nd line of the model summary! See the edit of the original post for how to get the contrasts of this model!

      Delete
  7. Isn't glht() only for parametric models, which glmer is not?
    The documentation is confusing, it says "General linear hypotheses and multiple comparisons for parametric models, including generalized linear models, linear mixed effects models [...]" but as far as I know, glms are not for parametric data. Also, lmer() is mentioned but not glmer(). Any thoughts, anyone?

    ReplyDelete
  8. It's the models, not the data that is meant here! Of course all kind of GLMs are parametric models..

    ReplyDelete