Probability, Part 6, The Problem with Probability

In my last few posts, I’ve talked about probability and how to calculate a basic probability:

Probability = \[\frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}\]

This formula is simple if you know the number of favourable outcomes and the number of possible outcomes. It works well for questions like, what is the probability of rolling a 7 with a pair of dice? To count the total number of outcomes, note that a single die can land 6 possible ways, and for each of these, the other die can show 6 possible values. So the total is 6 × 6, or 36. This illustrates the multiplication rule for counting things:

If there are m ways for one thing to occur and n ways for a second thing to occur, then there are m × n ways to do both.

Manually counting the ways to get a 7 where the first number is from die 1 and the second from die 2 gives:

1 + 6, 2 + 5, 3 + 4, 4 + 3, 5 + 2, and 6 + 1. That is six ways.

So the probability of rolling a 7 is 6/36 or 1/6.
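If you would rather let a computer do the counting, here is a minimal Python sketch of the same calculation. The variable names (`outcomes`, `sevens`, `p_seven`) are my own choices for illustration; the code just lists every pair of die values and counts the pairs that sum to 7.

```python
from fractions import Fraction
from itertools import product

# Every ordered (die 1, die 2) result: 6 x 6 = 36 outcomes by the multiplication rule.
outcomes = list(product(range(1, 7), repeat=2))

# Favourable outcomes: the two dice sum to 7.
sevens = [(d1, d2) for d1, d2 in outcomes if d1 + d2 == 7]

p_seven = Fraction(len(sevens), len(outcomes))

print(len(outcomes))  # 36
print(sevens)         # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(p_seven)        # 1/6
```

Using `Fraction` keeps the answer as 1/6 rather than a rounded decimal.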

Now what if I asked what is the probability of getting four aces in a 5-card poker hand? How do you even begin to count the number of possible poker hands? There are two ways to count large numbers of possibilities like this: combinations and permutations.

A combination is a selection of objects from a collection where you are not concerned with order. For example, in the card example, a hand of 2, 3, 4, 5, and 6 of hearts is the same as a hand of 6, 5, 4, 3, and 2 of hearts, and you would want to count these two arrangements, along with any other arrangement of these five cards, as just one combination. A permutation is an arrangement where order does count, so these two hands would be counted as two different permutations.

In our card example, order doesn’t count, so we want the number of combinations of 52 cards taken 5 at a time. Fortunately, there is a formula and notation used to simplify this. Before I present it, there is another math operation that needs to be explained: factorials.

You may have a calculator with a “!” symbol or “x!” on one of the keys. This is the factorial operation. A factorial is the product of an integer and every positive integer below it. For example, 5! = 5 × 4 × 3 × 2 × 1 = 120. Factorials get large very quickly. For example, 30! is 265252859812191058636308480000000. To make the formulas using factorials consistent, a special definition 0! = 1 is made.
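As a small illustration, here is a sketch of a factorial function in Python. The function name `factorial` is just my choice for this example; Python’s standard library already provides `math.factorial`, which I use only as a cross-check.

```python
import math

def factorial(n: int) -> int:
    """Multiply n by every smaller positive integer; 0! is defined as 1."""
    result = 1
    for factor in range(2, n + 1):
        result *= factor
    return result

print(factorial(5))   # 120
print(factorial(0))   # 1
print(factorial(30) == math.factorial(30))  # True
print(len(str(factorial(30))))              # 33 digits
```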

So the notation for the number of combinations of n objects taken r at a time is \[{}_{n}C_{r}\] or, more commonly, C(n, r). In our case, we want to calculate C(52, 5), the number of possible 5-card combinations out of a deck of 52 cards. The general formula for combinations is

\[C(n, r) = \frac{n!}{r!(n - r)!}\]

In our poker hand example, the number of possible poker hands is

\[C(52, 5) = \frac{52!}{5!(52 - 5)!} = \frac{52!}{5!\,47!}\]

Now before you go off and calculate this, remember how large factorials can get? Many calculators cannot keep the number of digits necessary to accurately store very large numbers and the accuracy of the calculation will be poor. So when dealing with combination and permutation formulas, it is always best to simplify before calculating the answer. See if you can see where we can simplify the expression on the right side. I will continue this example in my next post.
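If you want to check your simplified answer afterwards, here is a hedged Python sketch. It relies on the fact that Python integers have arbitrary precision, so the factorial formula is exact here, unlike on many calculators. The names `by_formula` and `by_comb` are mine; `math.comb` is a standard library function available in Python 3.8 and later.

```python
import math

# Python integers have arbitrary precision, so the factorial formula is exact here.
n, r = 52, 5
by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# math.comb computes the same count without the caller forming 52! explicitly.
by_comb = math.comb(n, r)

print(by_formula)  # 2598960
print(by_comb)     # 2598960
print(len(str(math.factorial(n))))  # number of digits in 52!
```

On a tool with limited precision, simplifying by hand first is still the safer approach.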

Probability, Part 5, The Monty Hall Problem

In the 1960s (so I’m told), a new game show called Let’s Make A Deal appeared on TV. The show was hosted by Monty Hall. In this show, a contestant was shown three doors on the stage and was asked to choose one. One door had a great prize like a car. The other two had less desirable prizes like a goat. Well, right away, if you’ve been diligent reading my posts, you know that the contestant has a one-third chance of winning.

However, Monty would then open one of the doors not picked and expose a less desirable prize. Monty would then ask the contestant if she would like to switch. The question is, should she? Does it make a difference or is the probability still one-third either way? Well, let’s see the two probability trees for each strategy: stay or switch.

First the “stay” strategy. Let’s assume that the car is behind door 1 so we can label the branches appropriately. The second level branches have probability 0 or 1, because once you choose a door, the strategy of staying or switching determines exactly what you will end up with:

So if you look at all the scenarios (branches) that end up with a car and add those probabilities together, you get 1/3 as expected. Please see my previous post regarding probability trees if needed.

Now let’s create a tree where we switch doors after Monty shows what’s behind one of the “loser” doors:

Now add up the probabilities that end up with a car and you get 2/3! You double your chances of winning if you switch. You see, Monty adds more information to the problem by exposing one of the loser doors and you take advantage of this by switching doors. Because you have a 2/3 chance of initially picking a goat, if you do pick a goat and Monty exposes the other door with a goat, you will have no choice but to end up with the car if you switch.
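For anyone who wants to verify both results, here is a minimal Python sketch that walks the branches of each strategy and adds up the ones that end with the car. As in the post, I assume the car is behind door 1. I also assume that when Monty has two goat doors to choose from, he opens either one with equal probability. The function name `win_probability` is purely illustrative.

```python
from fractions import Fraction

DOORS = (1, 2, 3)
CAR = 1  # assumption carried over from the post: the car is behind door 1

def win_probability(switch: bool) -> Fraction:
    """Sum the branch probabilities that end with the car."""
    total = Fraction(0)
    for first_pick in DOORS:          # each first pick has probability 1/3
        # Monty opens a goat door that the contestant did not pick.
        openable = [d for d in DOORS if d != first_pick and d != CAR]
        for opened in openable:       # assumed equally likely when there are two
            branch = Fraction(1, 3) * Fraction(1, len(openable))
            if switch:
                final = next(d for d in DOORS if d not in (first_pick, opened))
            else:
                final = first_pick
            if final == CAR:
                total += branch
    return total

print(win_probability(switch=False))  # 1/3
print(win_probability(switch=True))   # 2/3
```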

Probability, Part 4

I would like to introduce probability trees. These help compute more complex probabilities and combine the addition and multiplication rules we covered earlier. Let’s look at a probability tree for two tosses of a coin:

To create the tree, you start with a branch for each possible event in the first trial, in this case heads or tails, then add further branches for all the possibilities in the second trial, and so on. You also include the probabilities on each branch segment.

Travelling along a branch depicts a joint probability. For example, what is the probability of tossing two heads? Travelling along the branches for two heads, you hit two probabilities, each 0.5. As we saw before, this is a joint probability so we multiply these together to get 0.25. So along a branch, you multiply the probabilities.

What if I asked what is the probability of getting two heads or two tails? The only two branch paths that satisfy this requirement are the top one and the bottom one, each with a calculated branch probability of 0.25. Since this is an OR probability, the addition rule applies and these are added:

Adding these probabilities gives the result of 0.50.
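The same tree can be written out in a few lines of Python. This is only a sketch; the dictionary names `P` and `paths` are my own. Each key of `paths` is one route through the tree, and its value is the product of the probabilities along that route.

```python
from itertools import product

P = {"H": 0.5, "T": 0.5}  # branch probabilities for a fair coin

# Each path through the tree is one (first toss, second toss) pair;
# multiply along the path to get its joint probability.
paths = {
    first + second: P[first] * P[second]
    for first, second in product("HT", repeat=2)
}

print(paths)  # {'HH': 0.25, 'HT': 0.25, 'TH': 0.25, 'TT': 0.25}

# Addition rule: add the probabilities of the paths that satisfy the question.
print(paths["HH"] + paths["TT"])  # 0.5
print(sum(paths.values()))        # 1.0
```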

Now let’s look at the marble experiment: picking two marbles in succession from a bag with 10 red and 10 blue marbles. We can build the probability tree, but to know what the probabilities are, we need to know if the first marble is replaced or not. From the last post, you saw that this affects the probabilities of the second pick. I’ll leave it as an exercise for you to build the tree for the “with replacement” case. It will be similar to the coin toss tree.

Without replacement, the tree will look like this:

I’ve kept the probabilities as fractions to make it clearer where they came from. See my last post if needed. Notice that the last column of probabilities adds up to 1, as it should since all possible branches have been included.

So what is the probability of picking two blue marbles? This can be read directly from the tree as 0.237. Now let me ask, which has the greater probability: picking two marbles of the same colour or two of different colours? If you add the probabilities of picking two blue or two red, you get 0.474, not quite 50% as you might expect. To get the probability of getting mixed marbles, you can either add the two tree probabilities of 0.263, or subtract the 0.474 we just calculated from 1, since “same colour” and “different colours” are mutually exclusive and are the only two possibilities. Both ways give the result 0.526. So which possibility would you choose if you were a betting person?
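Here is a hedged Python sketch of the “without replacement” tree, kept in fractions so you can see where each number comes from. The function name `two_pick_tree` and its parameters are my own invention for this illustration.

```python
from fractions import Fraction

def two_pick_tree(red: int, blue: int) -> dict:
    """Joint probability of each (first, second) colour pair, without replacement."""
    bag = {"red": red, "blue": blue}
    total = red + blue
    tree = {}
    for first, first_count in bag.items():
        p_first = Fraction(first_count, total)
        for second, second_count in bag.items():
            # One marble of the first colour is gone before the second pick.
            remaining = second_count - (1 if second == first else 0)
            p_second = Fraction(remaining, total - 1)
            tree[(first, second)] = p_first * p_second
    return tree

tree = two_pick_tree(red=10, blue=10)
same = tree[("red", "red")] + tree[("blue", "blue")]
mixed = tree[("red", "blue")] + tree[("blue", "red")]

print(tree[("blue", "blue")])  # 9/38, about 0.237
print(same)                    # 9/19, about 0.474
print(mixed)                   # 10/19, about 0.526
print(sum(tree.values()))      # 1
```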

In my next post, I will use probability trees to show the surprising result of the Monty Hall experiment.

Probability, Part 3

So what is the probability of tossing 3 heads in a row when flipping a coin? Well, another probability rule is the joint probability rule. For independent events (that is, one event does not affect the probability of the other), the rule is

P(A and B) = P(A) × P(B)

The result of flipping a coin does not affect the next flip of the coin, so these would be independent events and we can use this rule. The probability of flipping a heads is 0.5, so the probability of flipping three heads in a row is

P(flipping 3 heads) = 0.5 × 0.5 × 0.5 = 0.125 or 12.5%
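As a quick cross-check, the sketch below compares the joint probability rule with plain counting of all the possible sequences of three flips. The variable names are mine, and the code assumes a fair coin.

```python
from itertools import product

p_heads = 0.5
by_rule = p_heads ** 3  # joint probability rule for three independent flips

# Cross-check by counting: 8 equally likely sequences, only one is HHH.
sequences = list(product("HT", repeat=3))
by_counting = sequences.count(("H", "H", "H")) / len(sequences)

print(len(sequences))  # 8
print(by_rule)         # 0.125
print(by_counting)     # 0.125
```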

Now let’s look at another experiment. Suppose you have a bag of 20 marbles, 10 blue ones and 10 red ones. The probability of picking a red marble is 10/20 or 0.5. If you replace the marble, shake the bag and redo the experiment, the probability of picking a red marble is still the same, that is, the two experiments would be independent. So the probability of picking two red marbles in a row this way is 0.5 × 0.5 = 0.25. But what if you did not replace the marble? If the first marble was red, the bag has 19 marbles before the second pick, 9 red and 10 blue, so the probabilities of the second pick are affected by the first pick. This means that the two events are dependent.

If two events are dependent, say A depends on B, this is written as P(A|B), which means the probability that A occurs given that B has occurred.

For dependent events, the joint probability rule is modified slightly:

P(A and B) = P(A|B) × P(B)

So you still just multiply the probabilities, but you must adjust the probability of A if B occurs.

Now back to the marble experiment without replacing the marble. What is the probability of picking two red marbles in a row?

Well for the first pick, we already know that the probability is 0.5. But for the second pick, the probability is 9/19 because there are only 9 red marbles now and a total of 19 marbles. So the probability of picking two red marbles without replacement is

P(2 red marbles) = P(second marble is red | first marble is red) × P(first marble is red) = 9/19 × 0.5 = 0.237. So the probability of picking two red marbles is slightly lower without replacement than with replacement.
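Here is a small Python sketch that puts the two cases side by side. The variable names are my own; the numbers are the 10 red and 10 blue marbles from the example.

```python
from fractions import Fraction

red, blue = 10, 10
total = red + blue

p_first_red = Fraction(red, total)

# With replacement the picks are independent: P(A and B) = P(A) x P(B).
with_replacement = p_first_red * Fraction(red, total)

# Without replacement: P(A and B) = P(A|B) x P(B), where B is "first marble is red".
p_second_red_given_first_red = Fraction(red - 1, total - 1)
without_replacement = p_second_red_given_first_red * p_first_red

print(with_replacement)                      # 1/4
print(without_replacement)                   # 9/38
print(round(float(without_replacement), 3))  # 0.237
```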

This sets us up to do much more complex probabilities. In my next post, I’ll discuss probability trees.

Probability, Part 2

So we are discussing probability and so far, I’ve just used some simple examples where I used the rule:

Probability = \[\frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}\]

Now I would like to be able to show more complex examples, but first a definition:

Event – a collection of one or more outcomes of an experiment.

An experiment is flipping a coin, rolling a die, picking a card, etc. The outcome is what happens, that is, the result of the experiment. To save writing, we use the following notation:

P(A) is the probability of event A. Event A can be flipping heads, rolling a 2, picking the ace of hearts. Other letters can be used to represent other events.

So for example, what is the probability of rolling a die and getting a 1 or a 6? Well one way is to note that the number of favorable outcomes is 2 and the number of possibilities is 6, so

P(rolling a 1 or a 6) = 2/6 = 1/3

Another way is by using the probability addition rule:

P(A or B) = P(A) + P(B)

Event A can be rolling a 1 and event B can be rolling a 6. We know that the probability of rolling any single number is 1/6 so,

P(A or B) = 1/6 + 1/6 = 2/6 = 1/3

The addition rule only works for mutually exclusive events. Rolling a 1 means that rolling a 6 is impossible and vice versa.

What if I asked what is the probability of not rolling a 1 or a 6? Well there is something called the complement rule that is useful:

P(~A) = 1 – P(A), where ~A means not A

If A is rolling a 1 or a 6, then ~A is rolling any other number. But since we now know what the probability of throwing a 1 or a 6 is, we can use this to find the probability of not throwing a 1 or a 6:

P(not throwing a 1 or a 6) = 1 – P(rolling a 1 or a 6) = 1 – 1/3 = 2/3
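The addition and complement rules can be checked against direct counting with a short Python sketch. The helper `p` and the variable names are hypothetical, chosen just for this illustration, and the die is assumed to be fair.

```python
from fractions import Fraction

faces = range(1, 7)

def p(event) -> Fraction:
    """Classical probability: favourable faces over total faces."""
    favourable = [face for face in faces if event(face)]
    return Fraction(len(favourable), len(faces))

p_one = p(lambda face: face == 1)
p_six = p(lambda face: face == 6)

p_one_or_six = p_one + p_six  # addition rule (mutually exclusive events)
p_neither = 1 - p_one_or_six  # complement rule

print(p_one_or_six)  # 1/3
print(p_neither)     # 2/3
print(p(lambda face: face not in (1, 6)))  # 2/3, by direct counting
```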

In my next post, I’ll explore questions like what is the probability of tossing 3 heads in a row.

 

Probability, Part 1

So as I have acquired several new business statistics students, I thought I’d switch gears and talk about statistics. The first topic in statistics to get acquainted with is probability.

So what does it mean when the weatherperson says there’s an 80% chance of rain today? This is a probability expressed as a percentage. It means that given the weather conditions of today, 80 out of a hundred days like this will experience rain (on average). I say “on average” because a probability is just an indication of the likelihood of something occurring. Unless the probability is 0 (the event just will not happen) or 1 (the event will happen for sure), then the probability just expresses a likelihood of something happening. By the way, a probability is technically a number between 0 and 1. The 80% probability is the percentage equivalent to 0.80.

So most people intuitively know that the chance of flipping a head in a coin toss is 0.50 (50%). That does not mean that if you flip a coin twice, you will always get one heads and one tails. But if you flipped a coin 100 times and counted the heads, the count would tend to be close to 50, that is, half or 50% of 100.
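You can try this out with a simulated coin. The sketch below is only an illustration: the function name `heads_fraction` and the fixed seed are my own choices, and the exact fractions printed depend on the random number generator. The point is that the fraction of heads tends to settle near 0.5 as the number of flips grows.

```python
import random

def heads_fraction(flips: int, seed: int = 0) -> float:
    """Simulate fair coin flips and return the fraction that came up heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips

for flips in (10, 100, 10_000):
    print(flips, heads_fraction(flips))
```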

So how do you calculate a probability? The classic definition is

Probability = \[\frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}\]

So if we are interested in the probability of a heads in a single coin toss, the answer is

\[\frac{\text{One favorable outcome (heads)}}{\text{Possible outcomes (heads or tails)}} = \frac{1}{2} = 0.50\]

as expected.

What about the probability of getting a 1 on the roll of a fair die? Well the number of favorable outcomes is 1 and the total number of possibilities is 6. So the probability is \[
\frac{1}{6}
\] which you can leave as a fraction.

What about the probability of getting an odd number in a single throw of a die? The number of favorable outcomes is 3 (1, 3 or 5) and the total number of outcomes is 6. So the probability is \[\frac{3}{6}\] or \[\frac{1}{2}\] if you simplify this fraction. And it makes sense that on average, half the time when you throw a die, the number is odd and half the time the number is even.

This highlights the concept that the probabilities of all mutually exclusive and collectively exhaustive outcomes have to add up to 1, as something has to happen. Mutually exclusive events are events where, if one occurs, none of the others can occur. For example, throwing an odd number means that throwing an even number cannot occur. Collectively exhaustive means that all of the possible events are included. For example, throwing an odd number and throwing an even number are collectively exhaustive as there is no other possibility.

What is the probability of randomly selecting a face card from a shuffled deck of cards (no jokers)? Well, there are 12 face cards in a deck so the probability is \[\frac{12}{52}\] or \[\frac{3}{13}\].
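To see the counting spelled out, here is a minimal Python sketch that builds a 52-card deck and counts the face cards. The rank and suit labels are my own choices, and I take “face card” to mean jack, queen or king.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 cards, no jokers

# Favourable outcomes: jack, queen or king of any suit.
face_cards = [card for card in deck if card[0] in ("J", "Q", "K")]
p_face = Fraction(len(face_cards), len(deck))

print(len(deck), len(face_cards))  # 52 12
print(p_face)                      # 3/13
```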

So far, the examples have been simple experiments with simple outcomes that can be easily counted so that the probability fraction can be easily derived. The complication comes when it is not so easy to count these. I will expand on this in forthcoming posts.