Basics of Statistics and R
Preface - Aim and Scope
This page contains …
… a quick overview of what this book is and what it is intended for, as well as how to use it.
What is this?
This is not an exhaustive book featuring deep dives into the history or application of specific statistics concepts, ideas and methods. It is also not meant to be a textbook accompanying a university course - although I might use it in this way in the future.
Instead, this is meant to be a collection of small texts and examples serving as primers on core statistical concepts, methods and workflows. The aim here is to present the basic ideas (as well as their execution in R) and explain the reasons behind these ideas, in order to give you a general understanding as to what is happening. I will also point you in the right direction if you want to dive deeper.
Aren’t there many resources like this already?
Yes, and no. Although a lot of books have been written about the core ideas behind statistical methods, and many web resources about the application of these methods in any given statistical program exist, these suffer from some major drawbacks:
Books are long and nobody wants to read anymore. I hope that my way of explaining stuff as concise as possible, as well as my way of writing will resonate in a way that many “dry” statistics textbooks simply can’t.
While Andy Field’s (2012) is still a great book on the subject, it was published in 2012 and much of the code used is unfortunately wildly out-of-date. Don’t get me wrong, most of it will still work fine. However, better and more user-friendly ways to work in R have emerged since.
At the end of the day statistics is math, and math people love communicating via formulas and mathematical notations. I don’t know about you, but unfortunately my brain shuts off whenever I see more than two mathematical formulas in a text. Therefore, I will try to keep the mathematics to a minimum and instead focus on examples and comparisons to explain ideas.
How-To guides tend to focus mainly on the “how” of statistical methods and not on the “why” behind it. While they might make you an expert about anything a given R (Python, …) function may be able to do, you still won’t know when and how to use them - or why you might get certain weird results back. Also, nowadays LLMs can teach you the “how” part probably better than any website ever will.
Most statistics websites are run by professional website owners. This is not meant as a dig at them - they can do whatever they want - but I personally hate seeing ads and cookie-popups when I just want to find out why my code won’t work. Therefore, you won’t find ads or cookie banners here, because I don’t care who you are. :)
The Core Structure
This also means that this isn’t necessarily designed to be read front-to-back. Instead, I would encourage you to jump to a specific subject whenever you encounter it, and work your way from there until the end of your issues.
However, I have still organized the book into subsections that (in my view) link together logically connected subjects and methods, going from simple methods and concepts on towards more complex implementations. However, the sprawling nature of statistical methods and coding means that forcing everything into a linear progression would do more harm than good. Therefore, while you may encounter basic concepts in a given topic (like “Plotting Data” in the main topic “Descriptive Statistics”), you will also find expansions of these contexts in separate topics (in this case “Data Plotting in R”).
To emphasize this further, everything that centers on using R, and less on statistical methodology has been organized into it’s own part in the structure. While the numbered parts deal with statistics, part R focuses on using R and RStudio. I would encourage everyone who is not familiar with R (or certain packages I frequently use) to look at this part whenever you’re not sure what I’m doing or why I’m doing things in a certain way - you’ll hopefully find an explanation there.
Last modified: 2023-09-20 11:14, R version 4.3.1
Source data for this page can be found here.