Fitting Generalized Additive Models

3 June 2026

Acknowledgments

Gavin Simpson and colleagues have produced a lot of useful teaching material on GAMs that I’ve made use of here. For example:

Gavin’s blog

Webinar 1

Webinar 2

Goals for today

Understand some diagnostics
Understand more about model selection techniques with GAMs
Become familiar with some of the flexibility of GAMs

Last time

A generalized additive model is an extension of the GLM with a predictor that is a sum of smoothed (and possibly unsmoothed) terms:

\[ y_i = \beta_0 + \sum_j s_j(x_{j,i}) + \sum_k \beta_k x_{k,i} + \epsilon_i \]

And with splines, our smoothed terms are a sum of weighted basis functions:

\[ s(x)=\sum_k \beta_k b_k(x) \]

Choosing a model

Ask yourself:

data/error distribution?
predictors?
type of smoother(s)?

Check candidate models via diagnostics

Effect of choice of \(k\) (number of knots)?

An example

Chicago air quality and death rate (Peng and Welty (2004); {NMMAPSdata})

death: total deaths per day

pm25median: median particles < 2.5 mg per cubic m

o3median: ozone in parts per billion

tmpd: temperature in Fahrenheit

## get data
data(chicago, package = "gamair")

## fit gam
fit_gam_air <- mgcv::gam(death ~ s(pm25median) + 
                           s(o3median) + 
                           s(tmpd), 
                         family = "poisson", 
                         data = chicago, method = "REML")

An example

Smoothed Terms

##                    edf   Ref.df     Chi.sq      p-value
## s(pm25median) 3.518573 4.440907  20.860574 0.0006254809
## s(o3median)   3.430259 4.305665   6.373377 0.2230510318
## s(tmpd)       4.801666 5.878771 244.698414 0.0000000000

GAM interpretation

Effective degrees of freedom (edf) - degree of non-linearity, but with upper bound set by \(k\)
Reference degrees of freedom (Ref.df) - used for calculating test statistic
p-value - null hypothesis is a linear effect (this can include a flat linear effect, i.e., no effect)
- Biased low because they don’t account for need to estimate \(\lambda\)
- Works well asymptotically
- Doesn’t work for unpenalized smooths (random effects)

Model checking

Effect of \(k\) on edf (for median particles <2.5 mg per \(m^3\))

Model checking

Check whether \(k\) is adequate - can set it very large to start, but this increases computation time
Do some iterative checking with larger values of \(k\) for multiple parameters, effective degrees of freedom should stabilize

## check model fit
gam.check(fit_gam_air)

Model checking

\(k\) set to default for each variable

## 
## Method: REML   Optimizer: outer newton
## full convergence after 4 iterations.
## Gradient range [-7.763319e-07,1.839545e-08]
## (score 2867.802 & scale 1).
## Hessian positive definite, eigenvalue range [0.2056908,1.115369].
## Model rank =  28 / 28 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                 k'  edf k-index p-value  
## s(pm25median) 9.00 3.52    0.99    0.38  
## s(o3median)   9.00 3.43    0.97    0.19  
## s(tmpd)       9.00 4.80    0.94    0.06 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model checking

Again but with \(k = 20\) for all \(x\)

## fit with k = 20 for all
fit_gam_air <- mgcv::gam(death ~ s(pm25median, k = 20) + 
                           s(o3median, k = 20) + 
                           s(tmpd, k = 20), 
                         family = "poisson", 
                         data = chicago, method = "REML")

## check p-values
gam.check(fit_gam_air)

Model checking

\(p\)-value above \(\alpha\) and edf is not close to \(k\)

## 
## Method: REML   Optimizer: outer newton
## full convergence after 3 iterations.
## Gradient range [-1.827205e-06,5.041726e-07]
## (score 2867.819 & scale 1).
## Hessian positive definite, eigenvalue range [0.1852984,1.155305].
## Model rank =  58 / 58 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                  k'   edf k-index p-value  
## s(pm25median) 19.00  3.57    0.99   0.350  
## s(o3median)   19.00  3.59    0.97   0.215  
## s(tmpd)       19.00  5.01    0.94   0.075 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Other model diagnostics

`gam.check()` provides some basic diagnostic plots

## 
## Method: REML   Optimizer: outer newton
## full convergence after 3 iterations.
## Gradient range [-1.827205e-06,5.041726e-07]
## (score 2867.819 & scale 1).
## Hessian positive definite, eigenvalue range [0.1852984,1.155305].
## Model rank =  58 / 58 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                  k'   edf k-index p-value  
## s(pm25median) 19.00  3.57    0.99    0.32  
## s(o3median)   19.00  3.59    0.97    0.24  
## s(tmpd)       19.00  5.01    0.94    0.06 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model selection

When we select \(\lambda\) we are only penalizing wiggliness

However, this only penalizes functions of \(X\) with a second derivative > 0

Thus, linear relationships can’t be removed from the model this way

Model selection

Use select = TRUE to apply penalties to the linear terms as well

## apply penalties to linear terms 
fit_gam_air <- mgcv::gam(death ~ s(pm25median, k = 20) + 
                           s(o3median, k = 20) + 
                           s(tmpd, k = 20), 
                         select = TRUE,
                         family = "poisson",
                         data = chicago, method = "REML")

Model selection

Check to see if edf < 1

## check edf values
summary(fit_gam_air)$s.table

##                    edf   Ref.df     Chi.sq     p-value
## s(pm25median) 3.567278 4.543784  20.830454 0.000704636
## s(o3median)   3.591463 4.540236   7.006517 0.201616686
## s(tmpd)       5.006420 6.297518 243.866273 0.000000000

Model selection

Use `gratia::draw()` to create partial effect plots for the smooths

Flexibility of GAMs

We have lots of options when fitting GAMs

so we have to give very careful thought to our goal(s)

Flexibility of GAMs

We have lots of options when fitting GAMs

so we have to give very careful thought to our goal(s)

Just because we can fit a GAM doesn’t mean it will be interpretable or useful!

Another example

Atmospheric CO₂ concentrations at the South Pole (1957 to 1999)

Candidate model

Cubic regression basis because have a lot of wiggliness in the data

Let’s make \(k\) really large then

## fit GAM with cubic regression basis
fit_co2_cubic <- gam(co2 ~ s(c.month, k = 300, bs = "cr"),
                     method = 'REML',
                     data = co2s)

Model summary

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## co2 ~ s(c.month, k = 300, bs = "cr")
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.382e+02  4.444e-03   76111   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F p-value    
## s(c.month) 205.3  241.4 44646  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =      1   Deviance explained =  100%
## -REML = 36.492  Scale est. = 0.0084334  n = 427

Predicting the future

Not a great model

Another candidate model

Let’s now distinguish within-year variation from the long-term trend

cyclic spline on month (1:12)
cubic regression spline on c.month (1:507)

## fit GAM with 
b2 <- gam(co2 ~ s(month, bs = "cc") + 
            s(c.month, bs = "cr", k = 300), 
          knots = list(month = c(0.5, 12.5)),
          method = 'REML',
          data = co2s)

Cyclic & cubic splines

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## co2 ~ s(month, bs = "cc") + s(c.month, bs = "cr", k = 300)
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.382e+02  5.331e-03   63447   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df       F p-value    
## s(month)     7.266    8.0   165.1  <2e-16 ***
## s(c.month) 107.309  131.8 56450.2  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =      1   Deviance explained =  100%
## -REML = -103.74  Scale est. = 0.012136  n = 427

Cyclic & cubic splines

Much more reasonable predictions

A cautionary note

Building GAMs requires more choices than building GLMs

It requires thinking hard about the system and the question

Options for interactions

main effects & interactions te(x1, x2)

separate smoothness parameter and separate marginal basis

Options for interactions

main effects & interactions te(x1, x2)

separate smoothness parameter and separate marginal basis

interactions only ti(x1, x2)

when the main effect are specified elsewhere

Options for interactions

main effects & interactions te(x1, x2)

separate smoothness parameter and separate marginal basis

interactions only ti(x1, x2)

when the main effect are specified elsewhere

bivariate thin plate splines s(x1, x2, bs = 'tp')

single smoothness parameter, only use when units are the same (eg, over space)

Options for interactions

main effects & interactions te(x1, x2)

separate smoothness parameter and separate marginal basis

interactions only ti(x1, x2)

when the main effect are specified elsewhere

bivariate thin plate splines s(x1, x2, bs = 'tp')

single smoothness parameter, only use when units are the same (eg, over space)

smooth by different levels of a factor f + s(x1, by = f)

be sure to include the factor as a fixed effect
akin to random regression slopes

Mixed models

GAMMs

We can also fit random effects

mgcv::gam(...,bs = “re”)`
mgcv::gamm()
gamm4::gamm4()

Thus, these models are equivalent:

lme(y ~ 1 + (1|f), method = "REML")
gam(y ~ s(f, bs = "re"), method = "REML")

Summary

GAMs using splines offer tremendous flexibility

There is some tradeoff between flexibility and ease of understanding

Many of our relationships in ecology aren’t linear

What are some examples?

Acknowledgments

Goals for today

Last time

Choosing a model

An example

An example

GAM interpretation

Model checking

Effect of \(k\) on edf (for median particles <2.5 mg per \(m^3\))

Model checking

Model checking

\(k\) set to default for each variable

Model checking

Again but with \(k = 20\) for all \(x\)

Model checking

\(p\)-value above \(\alpha\) and edf is not close to \(k\)

Other model diagnostics

gam.check() provides some basic diagnostic plots

Model selection

Model selection

Model selection

Model selection

Use gratia::draw() to create partial effect plots for the smooths

Flexibility of GAMs

Flexibility of GAMs

Another example

Atmospheric CO2 concentrations at the South Pole (1957 to 1999)

Candidate model

Model summary

Predicting the future

Not a great model

Another candidate model

Cyclic & cubic splines

Cyclic & cubic splines

Much more reasonable predictions

A cautionary note

Options for interactions

Options for interactions

Options for interactions

Options for interactions

Mixed models

GAMMs

Summary

`gam.check()` provides some basic diagnostic plots

Use `gratia::draw()` to create partial effect plots for the smooths

Atmospheric CO₂ concentrations at the South Pole (1957 to 1999)