This book presents the R materials related to the course Statistics II.
We have divided the book into the following parts:
Linear Statistical Models: This part focuses on examining variation in continuous dependent variables using a linear statistical model with a particular focus on linear regression.
Logistic Regression: This part focuses on predicting binary dependent variables using a logistic regression model.
Interactions within Linear and Logistic Models: This part focuses on incorporating interaction terms into linear and logistic regression models to examine whether the effect of one independent variable depends on the properties of another independent variable.
These three sections introduce the syntax used to perform the core methods of analysis in the class. We will break down the contents of this syntax and explain the logic behind the different commands that we will use. In addition, we will often annotate the syntax via comments (presented either in grayed-out text or in bubbles that you can examine by hovering over them with your computer mouse) to call further attention to aspects of the syntax. Here is an example:
# Packages
library(tidyverse) # Used for data management and plotting

# A linear regression model
model1 <- lm(mpg ~ drat, data = mtcars)
1. We may put annotations in these little bubbles, particularly if they are a little longer or if we are repeating a point you have seen before.
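To give a rough sense of how the three parts described above differ in syntax, here is a minimal sketch using base R and the built-in mtcars data. The object names and the choice of variables are our own assumptions for illustration and are not taken from the later chapters.

```r
# Illustrative sketch only: object names and variables are hypothetical choices
data(mtcars)

# Part 1: linear regression with a continuous dependent variable (mpg)
linear_model <- lm(mpg ~ wt, data = mtcars)

# Part 2: logistic regression with a binary dependent variable
# (am is coded 0 = automatic, 1 = manual)
logit_model <- glm(am ~ wt, data = mtcars, family = binomial)

# Part 3: interaction term; wt * hp expands to wt + hp + wt:hp
interaction_model <- lm(mpg ~ wt * hp, data = mtcars)

# Inspect the estimated coefficients of each model
coef(linear_model)
coef(logit_model)
coef(interaction_model)
```

Note that the only change between the linear and logistic models is the function (`glm()` instead of `lm()`) and the `family` argument, while the interaction is specified entirely within the formula.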
These sections also provide general guidelines on how to present and correctly report the output from these statistical analyses. Alongside the output from the statistical software, you will find additional information in boxes such as these:
Output explanation
This box contains information on the output.
Interpretation
This box contains general guidelines for the interpretation of the results, e.g., different rules of thumb for the interpretation of effect sizes.
Report
✓ This box shows the general format you should use to report the results of your analysis.
Warning!
This box will call your attention to things that could go wrong or otherwise pose a problem.
The final part of the book provides Appendices with some supplementary information. Appendix A provides an overview of some Common Errors that you may encounter when performing these analyses and when attempting to knit your R assignment files into an HTML file for submission. Appendix B provides an overview of the R libraries (and associated functions) that we will use in this course, the week they are introduced, and a script that will enable you to install them on your personal computer all in one go. Appendix C provides relevant formulas related to the types of analysis that we’ll examine in Statistics II. The final two appendices, meanwhile, are meant for students working on their second-year Data Skills or third-year BAP projects.
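As a rough idea of what an "install everything in one go" script can look like, here is a minimal sketch. It is not the script from Appendix B: the package list is a placeholder that contains only tidyverse (the one package named in this chapter), and the variable names are hypothetical.

```r
# Hypothetical package list: replace with the packages listed in Appendix B
needed_packages <- c("tidyverse")

# Keep only the packages that are not yet installed on this computer
missing_packages <- setdiff(needed_packages, rownames(installed.packages()))

# Install the missing packages, but only when running R interactively
if (interactive() && length(missing_packages) > 0) {
  install.packages(missing_packages)
}
```

The `interactive()` check is a design choice in this sketch: it keeps the installation step from running while a document is being knit, where installing packages is generally best avoided.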
Statistics I Book
The contents of this book build on what you learned in Statistics I, particularly when it comes to data management processes (e.g., how to import data, filter it, summarize it, etc.). If you need a refresher on these processes, please consult the Statistics I book. We will occasionally link to particularly relevant sections of that book in the discussions to come.
Overview per week
For each week in the course, you need to read relevant chapters …. In 2024-2025, this is: