In statistics, you’ll often work with a random variable $X$. This variable usually represents a set of data from a random trial or a statistical survey. A random variable has several possible outcomes. Which outcomes are possible depends on the random variable.
Example 1. Let $X$ be the number of dots you get when throwing a six-sided die. That makes $X$ a random variable, and its possible outcomes are 1, 2, 3, 4, 5 and 6.
Example 2. You buy a scratch ticket and scratch it. Let $Y$ be the number of dollars you win. This makes $Y$ a random variable. Most of the time you don’t win anything, so the most common outcome of $Y$ is 0, but if you win 10 dollars, the outcome of $Y$ is 10.
In the two examples above you can see that $X$ and $Y$ are very different from each other. The variable $X$ has only six different outcomes, all of which are equally likely to occur. On the other hand, $Y$ has a large number of possible outcomes, and 0 is much more likely to occur than any other outcome.
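The claim that all six outcomes of the die are equally likely can be checked informally with a simulation. The following is a minimal sketch, assuming a fair die modeled with Python's standard `random` module; the function name `throw_die`, the seed, and the number of throws are illustrative choices, not part of the text above.

```python
import random
from collections import Counter

def throw_die(n_throws, seed=0):
    """Simulate n_throws of a fair six-sided die and count each outcome."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return Counter(rng.randint(1, 6) for _ in range(n_throws))

n_throws = 60_000
counts = throw_die(n_throws)

# Print the relative frequency of each outcome; each should be close to 1/6.
for outcome in sorted(counts):
    print(outcome, counts[outcome] / n_throws)
```

With many throws, the relative frequencies settle near the same value for every outcome. A scratch ticket would look very different, with almost all of the weight on 0.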
Mathematically, we say that $X$ and $Y$ have different probability distributions. A probability distribution is a rule that tells you how likely every outcome is. The notation $P(X = x)$ means “the probability of $X$ being $x$”.
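To make the idea of a distribution as a rule concrete, here is a small sketch that stores the distribution of the six-sided die as a lookup table, using the fact that its six outcomes are equally likely. The names `die_distribution` and `prob` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

# Distribution of a fair six-sided die: each outcome has probability 1/6.
die_distribution = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

def prob(distribution, x):
    """Return P(X = x); outcomes that are not listed have probability 0."""
    return distribution.get(x, Fraction(0))

print(prob(die_distribution, 3))       # 1/6
print(sum(die_distribution.values()))  # 1
```

The last line illustrates a property every probability distribution must have: the probabilities of all possible outcomes add up to 1.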
In statistics, the term probability distribution is used to describe a specific type of formula. When you read probability distribution, think of it as a formula. Different probability distributions have to satisfy different criteria, just like every other formula you’ve come across in mathematics.
Note! It’s very important that you know which probability distribution applies in each case. By determining this, you’ll know which distribution to use on a given set of data.
The probability distribution from Example 1 above is “the probability of 1 is $\frac{1}{6}$, the probability of 2 is $\frac{1}{6}$” and so on. This can be represented in a table like this: