Statistical analysis and presentation using R

An overview of R functions used in Statistics II

Authors

Leila Demarest

Joshua Robison

Published

September 30, 2025

Preface

This book presents the R materials related to the course Statistics II.

We have divided the book into the following parts:

Linear Statistical Models: This part focuses on examining variation in continuous dependent variables using a linear statistical model with a particular focus on linear regression.
Logistic Regression: This part focuses on predicting binary dependent variables using a logistic regression model.
Interactions within Linear and Logistic Models: This part focuses on incorporating interaction terms into linear and logistic regression models to examine whether the effect of one independent variable depends on the value of another independent variable.
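As a rough preview, each of the three parts corresponds to a slightly different model call in R. The sketch below uses the built-in mtcars data. The variables (mpg, wt, am) and the object names are our own choices for illustration and are not taken from the course materials.

```r
# Illustrative sketch only: variables come from the built-in mtcars data

# Part 1: linear regression with a continuous dependent variable
linear_model <- lm(mpg ~ wt, data = mtcars)

# Part 2: logistic regression with a binary dependent variable (am is coded 0/1)
logit_model <- glm(am ~ wt, data = mtcars, family = binomial)

# Part 3: interaction term; wt * am expands to wt + am + wt:am
interaction_model <- lm(mpg ~ wt * am, data = mtcars)

# The coefficient names show which terms the interaction model contains
names(coef(interaction_model))
```

The only changes between the three calls are the fitting function (lm versus glm with a binomial family) and the formula on the right-hand side of the tilde.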

These three sections introduce the syntax used to perform the core methods of analysis in the class. We will break down the contents of this syntax and explain its logic as we introduce the different commands that we will use. In addition, we will often annotate the syntax via comments (either presented in grayed-out text or in bubbles that you can examine by hovering over them with your mouse) to further call your attention to aspects of the syntax. Here is an example:

# Packages
library(tidyverse)   #Used for data management and plotting

# A linear regression model
model1 <- lm(mpg ~ drat, data = mtcars)

1: We may put annotations in these little bubbles particularly if they are a little longer in nature or if we are repeating a point you have seen before.

These sections also provide general guidelines on how to present and correctly report the output from these statistical analyses. You will find the output from the statistical software as well as additional information:

Output explanation

This box contains information on the output.

Interpretation

This box contains general guidelines for the interpretation of the results, e.g., different rules of thumb for interpreting effect sizes.

Report

✓ This box shows the general format you should use to report the results of your analysis.

Warning!

This box will call your attention to things that could go wrong or otherwise pose a problem.

The final part of the book provides Appendices with some supplementary information. Appendix A provides an overview of some Common Errors that you may encounter when performing these analyses and when attempting to knit your R assignment files into an HTML file for submission. Appendix B provides an overview of the R libraries (and associated functions) that we will use in this course, the week they are introduced, and a script that will enable you to install them on your personal computer all in one go. Appendix C provides relevant formulas related to the types of analysis that we'll examine in Statistics II. The final two appendices, meanwhile, are meant for students working on their second-year Data Skills or third-year BAP projects.
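The actual installation script is in Appendix B. The general pattern behind a script of this kind can be sketched as follows. This is a minimal sketch under our own assumptions: the package list is hypothetical (only tidyverse is named in this preface), and the object names are ours.

```r
# Hypothetical package list; see Appendix B for the packages used in the course
course_packages <- c("tidyverse")

# Keep only the packages that are not yet installed on this computer
missing_packages <- setdiff(course_packages, rownames(installed.packages()))

# Install whatever is missing, but only in an interactive session
# (so nothing is installed while a document is being knitted)
if (interactive() && length(missing_packages) > 0) {
  install.packages(missing_packages)
}
```

Checking what is already installed first means the script can be re-run safely without reinstalling everything each time.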

Statistics I Book

The contents of this book build on what you learned in Statistics I, particularly when it comes to data management processes (e.g., how to import data, filter it, summarize it, etc.). If you need a refresher on these processes, then please consult the Statistics I book. We will occasionally link to sections of particular relevance from that book in the discussions to come.

Overview per week

For each week in the course, you need to read relevant chapters …. In 2024-2025, this is:

Week	Section	Chapters
1	Linear Models	1 Investigating Relationships between Continuous Variables ; 8 Reporting and Presenting Results (8.2 & 8.3)
2	Linear Models	2 Bivariate Regression with Binary & Categorical Predictors ; 3 Statistical Significance ; 5 Predicted & Residual Values (5.1 & 5.2) ; 8 Reporting and Presenting Results (8.4)
3	Linear Models	4 Multiple Linear Regression ; 5 Predicted & Residual Values (5.3) ; 6 Model Fit ; 8 Reporting and Presenting Results (8.4 - 8.7)
4	Linear Models	7 OLS Assumptions
5	Logistic Regression	9 Logistic Regression & Odds Ratios ; 10 Marginal Effects ; 11 Predicted Probabilities ; 14 Reporting & Presenting Logistic Regressions
6	Logistic Regression	12 Model Fit and Comparisons ; 13 Logistic Regression Assumptions
7	Interactions	15 Including an Interaction Term in a Regression Model ; 16 Marginal Effects in Interaction Models ; 17 Predicted Values from Interaction Models