Gavin Simpson and colleagues have produced a lot of useful teaching material on GAMs that I’ve made use of here. For example:
1 June 2026
Gavin Simpson and colleagues have produced a lot of useful teaching material on GAMs that I’ve made use of here. For example:
Understand what a smoother is and understand the motivation for using a smoother in a model
Learn how to fit some basic GAMs with splines
We seek a model that allows for some smooth and non-linear relationship between year and temperature
We seek a model that allows for some smooth and non-linear relationship between year and temperature
A generalized additive model is an extension of the GLM with a predictor that is a sum of smoothed (and possibly unsmoothed) terms:
\[ y_i = \beta_0 + \sum_j s_j(x_{j,i}) + \sum_k \beta_k x_{k,i} + \epsilon_i \]
Fit many local (weighted) regressions within a sliding window
LOESS fitting is available in the mgcv::gam() function
Pros
Pros
Cons
Splines are basis expansions
For example, a polynomial is one type of basis expansion:
\[ \begin{aligned} x^0 &= 1\\ x^1 &= x\\ x^2 &= x^2\\ x^3 &= x^3\\ &~~\vdots \end{aligned} \]
In a basis expansion
We call each of the simpler functions a basis function
The set of simpler functions, \(b_k\), is called a basis
When we model using splines, each of the \(b_k\) has a coefficient \(\beta_k\)
The spline is the sum of these functions (weighted by \(\beta_k\) and evaluated at \(x\)):
\[ s(x)=\sum_k \beta_k b_k(x) \]
One of the easiest to think about is a cubic regression spline
One of the easiest to think about is a cubic regression spline
\(X\) is divided into intervals (defined by the number of “knots”)
In each interval, we fit a cubic polynomial
\[ y_i = \beta_0 + \beta_1 x_i + \beta_2 x_{i}^2 + \beta_3 x_{i}^3 \]
We want to allow for a function that is “wiggly”, but not too wiggly
First, consider the squared second derivative:
\[ \int_{\mathbb{R}}[f'']^2dx \]
This is the rate of change in the slope - more wiggliness will result in a higher rate of change in the slope
We estimate this with:
\[ \int_{\mathbb{R}}[f'']^2dx=\beta^TS\beta \]
where \(S\) is a penalty matrix, derived from our weighted basis functions
We now use this to penalize our likelihood function
\[ log(\mathcal{L_p{|\beta}})=log(\mathcal{L{|\beta}}) - \lambda\beta^TS\beta \]
where \(\lambda\) is a smoothness parameter, indicating how much we penalize wiggliness
We will maximize this penalized likelihood
We set \(k\) as the maximum wiggliness (this determines the number of knots)
but what should we choose for the smoothness parameter \(\lambda\)?
method = "REML" uses a relationship between random effects and smoothers – we can think of \(\lambda\) as a prior on the spline coefficientsthin plate spline s(X, bs = 'tp') [default in mgcv::gam()]
thin plate spline s(X, bs = 'tp') [default in mgcv::gam()]
cubic regression spline s(X, bs = 'cr')
thin plate spline s(X, bs = 'tp') [default in mgcv::gam()]
cubic regression spline s(X, bs = 'cr')
cyclic spline s(X, bs = 'cc')
thin plate spline s(X, bs = 'tp') [default in mgcv::gam()]
cubic regression spline s(X, bs = 'cr')
cyclic spline s(X, bs = 'cc')
splines on a sphere s(X, bs = 'sos')
many others in mgcv::gam() for special situations
Let’s fit a GAM to the global temperature data using splines
ex_gam_fit <- mgcv::gam(temp ~ s(year), data = sst, method = 'REML')
## ## Family: gaussian ## Link function: identity ## ## Formula: ## temp ~ s(year) ## ## Parametric coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.056439 0.006859 -8.229 5.36e-14 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Approximate significance of smooth terms: ## edf Ref.df F p-value ## s(year) 8.515 8.933 248.8 <2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## R-sq.(adj) = 0.927 Deviance explained = 93.1% ## -REML = -148.15 Scale est. = 0.0082328 n = 175
## fit GAM with spline for temp
ex_gam_fit <- gam(sar ~ s(temp),
data = chinook, method = "REML")
## ## Family: gaussian ## Link function: identity ## ## Formula: ## sar ~ s(temp) ## ## Parametric coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 3.5605 0.1291 27.58 <2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Approximate significance of smooth terms: ## edf Ref.df F p-value ## s(temp) 2.94 3.63 3.387 0.0304 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## R-sq.(adj) = 0.273 Deviance explained = 34.4% ## -REML = 36.917 Scale est. = 0.51672 n = 31
ex_glm_fit <- glm(cbind(adults, smolts - adults) ~ temp + I(temp^2),
family = binomial(link = "logit"),
data = chinook)
Model checking
More on model selection
The flexibility of GAMs