Preface

We hope readers will take away three ideas from this book in addition to forming a foundation of statistical thinking and methods.

  1. Statistics is an applied field with a wide range of practical applications.
  2. You don’t have to be a math guru to learn from interesting, real data.
  3. Data are messy, and statistical tools are imperfect. However, when you understand the strengths and weaknesses of these tools, you can use them to learn interesting things about the~world.

Textbook overview

  1. Introduction to data. Data structures, variables, summaries, graphics, and basic data collection techniques.
  2. Exploratory data analysis. Data visualization and summarisation.
  3. Correlation and regression. Visualising relationships between many variables and descriptive summaries for quantifying the relationship between two variables.
  4. Multiple regression. Descriptive summaries for quantifying the relationship between two variables.
  5. Foundations for inference. Case studies are used to introduce the ideas of statistical inference with randomization and simulations.
  6. Inference for categorical data. Inference for proportions using simulation and randomization techniques as well as the normal and chi-square distributions.
  7. Inference for numerical data. Inference for one or two sample means using simulation and randomization techniques as well as the normal and F distributions.
  8. Inference for regression. Extending inference techniques presented thus-far to regression settings.

Examples and exercises

Examples are provided to establish an understanding of how to apply methods.

This is an example. When a question is asked here, where can the answer be found?


The answer can be found here, in the solution section of the example!

When we think the reader should be ready to try determining the solution to an example, we frame it as Guided Practice.

The reader may check or learn the answer to any Guided Practice problem by reviewing the full solution in a footnote.1

Exercises are also provided at the end of each section as well as review exercises at the end of each chapter.

Datasets

A large majority of the datasets used in the book can be found in various R packages. Each time a new dataset is introduced in the narrative, a reference to the package like the one below is provided. Many of these datasets are in the openintro package that contains data sets used in OpenIntro’s open-source textbooks.2

These data can be found in the openintro package: textbooks.

Computing with R

The narrative and the exercises in the book are computing language agnostic, however while it’s possible to learn about modern statistics without computing, it’s not possible to apply it. Therefore, we invite you to navigate the concepts you have learned in each chapter using the interactive R tutorials and the R labs that are included at the end of each chapter.

Interactive R tutorials

These are self-paced tutorials developed using the learnr package and you only need a browser to complete them.

Each chapter comes with a tutorial which is listed like this, which is comprised of 3-5 lessons.

Each of these lessons…

… are listed like this.

You can also access the full list of tutorials supporting this book here.

R labs

Once you feel comfortable with the material in these tutorials, we also encourage you to apply what you’ve learned via the computational labs that are also linked at the end of each chapter. These are data analysis case studies and they require access to R and RStudio. The first lab includes installation instructions. If you’d rather not install the software, you can also try RStudio Cloud for free.

Labs for each chapter are listed like this.

And you can find the full list of labs supporting this textbook here.

OpenIntro, online resources, and getting involved

OpenIntro is an organization focused on developing free and affordable education materials.

We encourage anyone learning or teaching statistics to visit openintro.org and get involved.

All OpenIntro resources are free and anyone is welcomed to use these online tools and resources with or without this textbook as a companion.

We value your feedback. If there is a part of the project you especially like or think needs improvement, we want to hear from you. For feedback on this specific book, you can open an issue on GitHub. You can also provide feedback on this book or any other OpenIntro resource via our contact form at openintro.org.

Acknowledgements

The OpenIntro project would not have been possible without the dedication and volunteer hours of all those involved. No one has received any monetary compensation from this project, and we hope you will join us in extending a huge thank you to all those who volunteer with OpenIntro.

The authors would like to thank

  • David Diez and Christopher Barr for their work on the 1st Edition of this book,
  • Ben Baumer and Andrew Bray for their contribution to the original vision for the revamp for the 2nd Edition as well as their work as original authors of the interactive tutorial content,
  • Yanina Bellini Saibene, Florencia D’Andrea, and Roxana Noelia Villafañe for their work on creating the interactive tutorials in learnr,
  • Will Gray for conceptual diagrams,
  • Allison Theobold and Randy Prium for their valuable feedback during the development of the book,
  • Colin Rundel for technical help with conversion from LaTeX to R Markdown, and
  • Müge Çetinkaya and Meenal Patel for their design vision.

We would like to also thank the developers of the open-source tools that make the development and authoring of this book possible, e.g. bookdown, tidyverse, and icons8.

We are also grateful to the many teachers, students, and other readers who have helped improve OpenIntro resources through their feedback.