- Understand the structural components of generalized linear mixed models
- Understand the options for fitting GLMMs and their pros and cons
- Understand some of the diagnostics available for evaluating GLMM fits
27 May 2026
GLMMs combine the flexibility of non-normal distributions (GLMs) with the ability to address correlations among observations and nested data structures (LMMs)
Good news
Bad news
these models are on the frontier of statistical research
existing documentation is rather technical
multiple approaches for fitting models; some with different results
Just like GLMs, GLMMs have three components:
Distribution of the data \(f(y; \theta)\)
Link function \(g(\eta)\)
Linear predictor \(\eta\)
Just like GLMs, we can assume a variety of data distributions
Gaussian: continuous real values
gamma: continuous positive values
Bernoulli: binary outcomes
binomial: discrete successes/trials
Poisson: non-negative counts
negative binomial: non-negative counts with extra variation
zero-inflated Poisson & negative binomial
We can write the linear predictor for GLMs as
\[ \eta = \beta_0 + \beta_1 x_1 + ... + \beta_k x_k \\ \Downarrow \\ \eta = \mathbf{X} \boldsymbol{\beta} \]
where the \(\beta_i\) are fixed effects of factors or covariates \(x_i\)
For GLMMs, our linear predictor also includes random effects
\[ \eta = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \alpha_0 + \alpha_1 z_1 + \dots + \alpha_l z_l\\ \Downarrow \\ \eta = \mathbf{X} \boldsymbol{\beta} + \mathbf{Z} \boldsymbol{\alpha} \]
where the \(\beta_i\) are fixed effects of factors or covariates \(x_i\)
and \(\alpha_i\) are random effects of factors or covariates \(z_i\)
Survival of fish \(i\) in some location \(j\) \((s_{i,j})\) as a function of its length at the time of tagging \((x_{i,j})\) and a random effect for location
\[ \begin{aligned} \text{data distribution:} & ~~~ y_{i,j} \sim \text{Bernoulli}(s_{i,j}) \\ \text{link function:} & ~~~ \text{logit}(s_{i,j}) = \text{log}\left(\frac{s_{i,j}}{1-s_{i,j}}\right) = \mu_{i,j} \\ \text{linear model:} & ~~~ \mu_{i,j} = (\beta_0 + \alpha_j) + \beta_1x_{i,j} \\ & ~~~ \alpha_j \sim \text{N}(0, \sigma_{\delta}^2) \end{aligned} \]
Note: fitting GLMMs is complex
Why? Because GLMMs involve solving an integral with no analytical solution
Note: fitting GLMMs is complex
Why? Because GLMMs involve solving an integral with no analytical solution
Recall that we think of likelihoods in terms of the observed data, but
the random effects in our model are unobserved random variables
Note: fitting GLMMs is complex
Why? Because GLMMs involve solving an integral with no analytical solution
Recall that we think of likelihoods in terms of the observed data, but
the random effects in our model are unobserved random variables
So we need to integrate them out of the likelihood
Best practices suggest we try to keep things simple - lots and lots of random effects become hard to handle
The likelihood for a GLMM involves integrating over all possible random effects
\[ \mathcal{L}(y; \boldsymbol{\beta}, \phi, \boldsymbol{\nu}) = \prod_{i} \int \underbrace{f_d(y; \boldsymbol{\beta}, \phi, \boldsymbol{\alpha})}_{\text{distn for data}} ~ \underbrace{f_r(\boldsymbol{\alpha}; \boldsymbol{\nu})}_{\text{distn for RE}} d \boldsymbol{\alpha} \]
If \(f(y; \boldsymbol{\beta}, \phi, \boldsymbol{\alpha})\) is not Gaussian, we cannot remove it from the likelihood, which makes it very difficult to compute
To avoid the integral, we will consider 3 methods that approximate the likelihood
They all have pros and cons so it’s not possible to pick the “best”
Produces linearized version of the response & then uses LMM methods to fit it
Pros
Cons
not a true likelihood, making inference challenging
only asymptotically correct
biased for binomial and Poisson with small samples
Uses Laplace approximation for computing integrals of the form \(\int e^{h(x)} d x\), which is a good approximation of the integral in a GLMM likelihood
Pros
approximation of true likelihood rather than quasi-likelihood
more accurate than PQL
Cons
slower and less flexible than PQL
may be impossible to compute for complex models
An expansion of Laplace approximation where the integrand is evaluated at more than one point
Pros
Cons
Slow and computationally intense
Limited to a few random effects (one in practice)
Sampling from an estimated posterior distribution rather than computing the integral
This is what is done in Bayesian estimation
MCMC has the ability to handle much more complex random effects
Let’s consider a long-term study of invasive brown tree snakes in Guam
Introduced to the island shortly after WWII
Voracious predators on native birds and other vertebrates
Our data consist of counts of the number of eggs per female at 10 locations over 13 years
We are interested in the fixed effect of body size and the random effects of location and year
We’ll begin with only the effects of body size and location
## fit model via penalized quasi-likelihood
fit_pql <- MASS::glmmPQL(eggs ~ size, random = ~ 1|loc,
family = poisson,
data = df_eggs)
## load {lme4}
library(lme4)
## set control parameters for model fitting
cntrl <- glmerControl(optimizer = "bobyqa", tol = 1e-4, optCtrl = list(maxfun = 100000))
## fit model via Laplace approximation
fit_lap <- glmer(eggs ~ size + (1|loc),
family = poisson,
data = df_eggs,
control = cntrl)
## fit model via Gauss-Hermite quadrature
fit_ghq <- glmer(eggs ~ size + (1|loc),
family = poisson,
data = df_eggs,
nAGQ = 40,
control = cntrl)
Summary of the results from the 3 methods
## PQL SE Laplace SE GHQ SE ## intercept 1.396 0.209 1.387 0.209 1.387 0.209 ## body size 0.500 0.059 0.500 0.051 0.500 0.052 ## sigma_loc 0.804 NA 0.652 NA 0.652 NA
## true values intercept = 1 body size = 0.5 sigma_loc = 0.5
What if we also want to include the random effect of year?
glmmPQL only allows for nested random effects
glmer(..., nAGQ = pts) only allows for one random effect
we can use Laplace approximation via glmer
## fit GLMM with location & year
fit_2_lap <- glmer(eggs ~ size + (1|loc) + (1|year),
family = poisson,
data = df_eggs)
## Laplace SE ## intercept 1.359 0.216 ## body size 0.551 0.053 ## sigma_loc 0.635 NA ## sigma_year 0.253 NA
## true values intercept = 1 body size = 0.5 sigma_loc = 0.5 sigma_year = 0.25
| Method | Advantages | Disadvantages | R functions |
|---|---|---|---|
| Penalized quasi-likelihood | Flexible, widely implemented | inference may be inappropriate; potentially biased | MASS::glmmPQL |
| Laplace approximation | More accurate than PQL | Slower and less flexible than PQL | glmer glmmsr::glmm glmmML::glmmML |
| Gauss-Hermite quadrature | More accurate than Laplace | Slower than Laplace; limited random effects | glmer glmmsr::glmm glmmML::glmmML |
Adapted from Bolker et al (2009)