This is the free website for Design and Analysis of Experiments and Observational Studies using R. A hardcopy of the book can be purchased from Routledge.
This book grew out of course notes for a twelve-week (one-term) course on the Design of Experiments and Observational Studies in the Department of Statistical Sciences at the University of Toronto. Students are senior undergraduates and applied Masters students who have completed courses in probability, mathematical statistics, and regression analysis.
The purposes of the book are to expose students to the foundations of classical experimental design and the design of observational studies through the framework of causality, and to use real data and computational tools, such as simulation, to explore these topics. The book uses R to implement designs and analyse data.
It is assumed that the reader has taken basic courses in probability, mathematical statistics, and linear models, although the essentials are reviewed briefly in the first chapter. Some experience using R is helpful although not essential. I assume that readers are familiar with standard base R and tidyverse syntax. In the course at the University of Toronto, students are given learning resources at the beginning of the course to review these R basics, although most students have had some exposure to computing with R.
This website is free to use, and is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.
Organization of the book
Each chapter presents concepts or methods, followed by a section that shows readers how to implement them in R. These sections are labeled “Computational Lab: Topic”, where “Topic” is the topic that is implemented in R.
Software information and conventions
One of the unique features of this book is the emphasis on simulation and computation using R. R is wonderful because of the many open source packages available, but this can also lead to confusion about which packages to use for a task. I have tried to minimize the number of packages used in the book. The set of packages loaded on startup by default is
getOption("defaultPackages")
#> [1] "datasets"  "utils"     "grDevices" "graphics"
#> [5] "stats"     "methods"
plus base. If a function from a non-default package is used, then this is indicated by pkg::name instead of attaching the package with library(pkg) and then calling name. This should make it clear which package a user needs to load before using a function.
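As a small illustration of the pkg::name convention, the sketch below calls a function from tools, a package that ships with R but is normally not attached at startup. The file name is made up for the example, and the check against the default package list assumes an unmodified R startup configuration.

```r
# Packages attached at startup (base is always available in addition to these).
default_pkgs <- getOption("defaultPackages")

# tools is installed with R but is not normally in the default list.
tools_is_default <- "tools" %in% default_pkgs

# pkg::name calls the function without attaching the package first.
# "experiment.csv" is a hypothetical file name; the file need not exist.
ext <- tools::file_ext("experiment.csv")
ext
#> [1] "csv"
```

Writing tools::file_ext() instead of library(tools) followed by file_ext() keeps the origin of the function visible at the point where it is used.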
Information on the R version used to write this book is below.
version
#>                _
#> platform       x86_64-apple-darwin17.0
#> arch           x86_64
#> os             darwin17.0
#> system         x86_64, darwin17.0
#> status
#> major          4
#> minor          2.2
#> year           2022
#> month          10
#> day            31
#> svn rev        83211
#> language       R
#> version.string R version 4.2.2 (2022-10-31)
#> nickname       Innocent and Trusting
The packages used in writing this book are:
library(tidyverse)
library(knitr)
library(kableExtra)
library(reticulate)
library(janitor)
library(latex2exp)
library(gridExtra)
library(broom)
library(patchwork)
library(crosstable)
library(agridat)
library(FrF2)
library(pwr)
library(emmeans)
library(DiagrammeR)
library(abind)
library(magic)
library(BsMD)
library(scidesignR)
Whenever possible, the R code developed in this book is written as a function instead of a series of statements. “Functions allow you to automate common tasks in a more powerful and general way than copy-and-pasting.”1 In fact, I have taken the approach that whenever I’ve copied and pasted a block of code more than twice, it’s time to write a function.
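A minimal sketch of this habit, using made-up data: rather than repeating the same group-mean calculation for every outcome, the calculation is wrapped once in a function. The name diff_means and the data values are hypothetical and do not come from the book.

```r
# Hypothetical helper: difference in mean outcome between the second and
# first group (groups are ordered alphabetically by tapply()).
diff_means <- function(y, group) {
  means <- tapply(y, group, mean)
  return(means[[2]] - means[[1]])
}

# Made-up outcomes for two treatment groups, A and B.
group <- c("A", "A", "A", "B", "B", "B")
y1 <- c(5, 7, 6, 9, 11, 10)
y2 <- c(1, 2, 3, 2, 3, 4)

# The same function is reused for each outcome instead of copying code.
d1 <- diff_means(y1, group)
d2 <- diff_means(y2, group)
```

Here d1 is the group B mean minus the group A mean for the first outcome, and d2 is the same quantity for the second outcome.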
The value an R function returns is the last value evaluated. return() can be used to return a value before the last expression is reached. Many of the functions in this book use return() to make code easier to read even when the last value of the function is returned.
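The difference between the two styles can be sketched as follows. Both function names are hypothetical, and the sample variance is used only because it is easy to check against var().

```r
# Implicit return: the value of the last evaluated expression is returned.
sample_var_implicit <- function(x) {
  n <- length(x)
  sum((x - mean(x))^2) / (n - 1)
}

# Explicit return(): one early exit, plus a final return() for readability.
sample_var <- function(x) {
  n <- length(x)
  if (n < 2) {
    return(NA_real_)  # leave early: the variance needs at least two values
  }
  s2 <- sum((x - mean(x))^2) / (n - 1)
  return(s2)
}

x <- c(2, 4, 6)
v_implicit <- sample_var_implicit(x)
v_explicit <- sample_var(x)
v_single <- sample_var(5)
```

For x above, both versions give the same value as var(x), while sample_var(5) exits early and gives NA.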
R 4.1.0 and later provide a simple native forward pipe syntax, |>. The simple form of the forward pipe inserts the left-hand side as the first argument in the right-hand side call. The pipe syntax used in this book is %>% from the magrittr package. Most of the code in this book should work with the native pipe |>, although this has not been thoroughly tested.
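The following sketch shows how a nested call reads when rewritten with the native pipe. It assumes R 4.1.0 or later, since earlier versions cannot parse |>. The magrittr version runs only if that package happens to be installed. The data are made up.

```r
x <- c(1, 4, 10)

# Nested calls are read from the inside out.
nested <- round(mean(sqrt(x)), 1)

# Native pipe: the left-hand side becomes the first argument of each call.
piped <- x |> sqrt() |> mean() |> round(1)

same_native <- identical(nested, piped)

# The magrittr pipe used in this book, evaluated only if magrittr is installed.
if (requireNamespace("magrittr", quietly = TRUE)) {
  `%>%` <- magrittr::`%>%`
  piped_magrittr <- x %>% sqrt() %>% mean() %>% round(1)
  same_magrittr <- identical(nested, piped_magrittr)
}
```

For simple first-argument chains like this one, the two pipes give the same result. They differ in more advanced uses, such as magrittr's dot placeholder.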
The data sets used in this book are available in the R package scidesignR, and can be installed by running