Background
This is meant to be a helpful resource for R functions commonly used
when working with data, fitting models, and interpreting output.
Data summaries
Here are some functions that are helpful for summarizing data
contained in a vector x.
mean(x): computes the arithmetic mean (average) of x
sd(x): computes the standard deviation of x
var(x): computes the variance of x
quantile(x): returns sample quantiles of x (by default the minimum,
quartiles, and maximum)
hist(x): creates a histogram of x
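As a minimal sketch of these functions in use, the following applies the numerical summaries to a small vector. The values in x are made up purely for illustration, and hist() is left out because it only draws a plot.

```r
## a small illustrative vector (values are made up)
x <- c(2, 4, 4, 5, 7, 9, 11)

## center and spread
mean(x)   # arithmetic mean
sd(x)     # standard deviation (uses an n - 1 denominator)
var(x)    # variance; equal to sd(x)^2

## sample quantiles; the default probabilities are 0, 0.25, 0.5, 0.75, 1
quantile(x)

## request specific quantiles with the `probs` argument
quantile(x, probs = c(0.025, 0.975))
```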
Random numbers
It’s common practice to use some form of randomization when
estimating and testing models.
sample(): draws a sample of a specified size from a
group, with or without replacement
## make all of this reproducible
set.seed(514)
## a sequence from 1 to 10
xx <- seq(1, 10)
## shuffle the order
sample(xx, size = length(xx), replace = FALSE)
## [1] 6 7 9 3 10 2 1 5 8 4
## grab 3 numbers at random
sample(xx, size = 3, replace = FALSE)
## [1] 2 7 8
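Both calls above sample without replacement. As a complementary sketch, setting replace = TRUE allows the same value to be drawn more than once, as is done in resampling methods such as the bootstrap. The object name boot is only an illustrative choice.

```r
## make this reproducible
set.seed(514)

## a sequence from 1 to 10
xx <- seq(1, 10)

## draw 10 values with replacement; some values may repeat and others may be absent
boot <- sample(xx, size = length(xx), replace = TRUE)
boot

## count how many times each value was drawn
table(boot)
```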
Data in models
Sometimes you want or need to be very explicit about how you want R
to treat some aspect of your data. For example, when used inside of a
model-fitting function, a + b will include both
a and b as predictors in a model, but
I(a + b) will include the sum of a and
b as a single predictor in a model.
I(): change the class of an object to indicate that it
should be treated ‘as is’
lm(y ~ I(x + z))
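To make the difference concrete, here is a small sketch that fits both versions of the model to simulated data and counts the estimated coefficients. The data, the "true" parameter values, and the object names fit_sep and fit_sum are assumptions made only for this illustration.

```r
## make this reproducible
set.seed(514)

## simulated predictors and response (illustrative values only)
x <- rnorm(50)
z <- rnorm(50)
y <- 1 + 2 * (x + z) + rnorm(50, sd = 0.5)

## x and z as separate predictors: an intercept plus two slopes
fit_sep <- lm(y ~ x + z)
length(coef(fit_sep))   # 3 coefficients

## the sum of x and z as a single predictor: an intercept plus one slope
fit_sum <- lm(y ~ I(x + z))
length(coef(fit_sum))   # 2 coefficients
```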
offset(): assume a model predictor has a known
coefficient of 1 rather than an unknown coefficient to be estimated
For example, if we counted 6 plants inside a 2 m² quadrat,
we might express the density of plants as 3 plants per m². As
you will see later, models for counts often assume a log-linear
relationship, such that
\[
\begin{aligned}
\log(\text{density}) &= \alpha + \beta x \\
&\Downarrow \\
\log(\text{number} ~/~ \text{m}^2) &= \alpha + \beta x \\
\log(\text{number}) - \log(\text{m}^2) &= \alpha + \beta x \\
\log(\text{number}) &= \alpha + \beta x + \log(\text{m}^2)
\end{aligned}
\]
We might model this relationship in R with something like
## log-linear regression
lm(log(number) ~ x + offset(log(area)))
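The following sketch runs this model on simulated data. Everything about the simulation is an assumption for illustration: the "true" values alpha and beta, the range of quadrat areas, and the choice to simulate number on the log scale as a continuous quantity so that log(number) is always defined. It also shows that, for lm(), the offset is equivalent to moving log(area) to the left-hand side, as in the derivation above.

```r
## make this reproducible
set.seed(514)

## assumed "true" values used only to simulate data
n <- 200
alpha <- 1
beta <- 0.5

## a predictor and quadrat areas (m^2)
x <- rnorm(n)
area <- runif(n, min = 1, max = 4)

## simulate on the log scale, treating number as continuous for simplicity
number <- exp(alpha + beta * x + log(area) + rnorm(n, sd = 0.1))

## log(area) enters with a fixed coefficient of 1, so only alpha and beta are estimated
fit <- lm(log(number) ~ x + offset(log(area)))
coef(fit)

## equivalent fit with log(area) subtracted from the response
fit_ratio <- lm(I(log(number) - log(area)) ~ x)
all.equal(coef(fit), coef(fit_ratio))
```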
scale(): shifts and/or scales a vector to have a mean of
0 and variance of 1
## random variate with mean of 5 and variance of 2
xx <- rnorm(100, mean = 5, sd = sqrt(2))
## scaled values with mean = 0 and variance = 1
zz <- scale(xx)
mean(zz); var(zz)
## [1] -2.60829e-16
## [,1]
## [1,] 1
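The matrix-style output from var(zz) arises because scale() returns a one-column matrix rather than a plain vector. As a short sketch, the same transformation can be done by hand and compared. The names zz_manual and cc are illustrative only.

```r
## make this reproducible
set.seed(514)

## random variates with mean of 5 and variance of 2
xx <- rnorm(100, mean = 5, sd = sqrt(2))

## scale() subtracts the mean and divides by the standard deviation by default
zz <- scale(xx)
is.matrix(zz)

## the same transformation done by hand, returned as a plain vector
zz_manual <- (xx - mean(xx)) / sd(xx)
all.equal(as.vector(zz), zz_manual)

## shift only: center on the mean but keep the original spread
cc <- scale(xx, center = TRUE, scale = FALSE)
mean(cc); sd(cc)
```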
Model summaries
You’ll see a variety of functions for fitting different types of
linear models in this course. The following functions are useful for
extracting information about a fitted model, such as estimated
coefficients, standard errors, and p-values.
anova(): computes analysis of variance (or deviance)
table for a fitted model object
coefficients(): extracts model coefficients from objects
returned by modeling functions
residuals(): extracts model residuals from objects
returned by modeling functions
summary(): returns a list of summary statistics of the
fitted linear model
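As a minimal sketch of these extractor functions, the example below fits a simple linear regression to simulated data and applies each one. The simulated data and the object name fit are assumptions made for illustration.

```r
## make this reproducible
set.seed(514)

## simulated data (illustrative values only)
x <- rnorm(30)
y <- 2 + 3 * x + rnorm(30)

## fit a simple linear regression
fit <- lm(y ~ x)

## analysis of variance table
anova(fit)

## estimated coefficients; coef(fit) is an equivalent shorthand
coefficients(fit)

## first few residuals
head(residuals(fit))

## full summary of the fitted model
summary(fit)

## the coefficient table holds estimates, standard errors, t values, and p-values
summary(fit)$coefficients
```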
Monte Carlo
There is a technique known as a Monte Carlo method (or
simulation or experiment), whose name comes from the Monte Carlo Casino
in Monaco. The idea is to estimate the uncertainty in a stochastic
variable or process via repeated sampling. For example, say you wanted
to estimate the probability that a coin would come up “heads” when
flipped into the air and allowed to land freely. To do so, you could
flip the coin 1000 times and, after each flip, record a 1 if heads or a 0
if tails. At the end, count up the number of 1s and divide the sum by
1000. Here’s a way we might do that in R.
## number of monte carlo experiments
nn <- 1000
## empty vector to record our results
rr <- rep(NA, nn)
## make this repeatable
set.seed(514)
## conduct the experiment
for(i in 1:nn) {
  ## flip the coin; rbinom(1, 1, 0.5) will return a 0 or 1 with probability 0.5
  flip <- rbinom(1, 1, 0.5)
  ## record the result
  rr[i] <- flip
}
## estimate probability of heads
sum(rr) / nn
## [1] 0.489
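The loop makes each step of the experiment explicit. As an alternative sketch, rbinom() can generate all of the flips in a single call, and the spread of the estimate can be approximated with the usual standard error for a proportion. The names p_hat and se_hat are illustrative, and the result is not assumed to match the loop's output exactly.

```r
## number of monte carlo experiments
nn <- 1000

## make this repeatable
set.seed(514)

## flip the coin nn times in one call
rr <- rbinom(nn, size = 1, prob = 0.5)

## estimate probability of heads
p_hat <- mean(rr)

## approximate standard error of the estimate
se_hat <- sqrt(p_hat * (1 - p_hat) / nn)

p_hat; se_hat
```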