Margin of Error

With the pending USA election, the news is awash with poll results showing voters’ preference for each candidate. And when these results are presented, there is usually a caveat that “however, these results are within the margin of error”, which makes the results a bit less conclusive. Why is that?

Without going through the plethora of maths that arrives at what follows, let me explain.

If we are trying to determine a parameter of a population (like the percentage of people that prefer a candidate), we need to ask everyone in a population in order to know the answer exactly. This is impossible in many situations, especially in the USA where all the people who will vote cannot be asked the question. So a sample of voters must be used. Now there are a lot of things to be considered to make sure that the sample used is truly random (that is, not biased), but let’s assume going forward, that the samples used are random.

First, without the math, for large samples, the distribution of the parameter being measured is approximately normal. This is fancy statistical wording that means the values one gets taking sample after sample will follow a bell curve:

This curve is scaled so that the probability that the parameter of interest lies between two values is the area under the curve between them. This means that the area under the entire curve must be 100%. So the probability that the parameter is between a and b, based on the sample, is the shaded area below:

If this is 68% of the total area, then that is the probability that the parameter being measured is between a and b.

Now let’s get to the current scenario. Suppose 1000 people are surveyed and 52% prefer candidate A and 48% prefer candidate B. Let’s look at the associated bell curve for candidate A:

A lot of math is involved here and a lot of assumptions (though they are reasonable). Notice that the curve is centered at the sample result of 52% (0.52 is the decimal equivalent). The range 0.49 to 0.55, that is, 0.52 – 0.03 to 0.52 + 0.03, contains 95% of the area (that is, of the probability). Without going through a lot of theory here, this range of numbers is the 95% confidence interval for this sample. So a statistician can say “based on this sample, I am 95% confident that the true percentage of all voters who support candidate A is between 49% and 55%”. This means that based on this sample, the true preference for candidate A can be as low as 49%. The number 0.03 which is added to and subtracted from the sample result is called the margin of error. This 95% confidence interval is the most common one used.

Now let’s look at the bell curve for candidate B and its associated confidence interval:

Notice that the 95% confidence interval for this result is 45% to 51%. That is, based on this sample, we can be 95% confident that the true percentage of all voters who support candidate B is between 45% and 51%. This means that the true preference for candidate B can be as high as 51%. And this is higher than the possible low preference for candidate A of 49%. That means that even though the sample shows that candidate A is preferred, the difference between the two values is not significant enough to state that candidate A is truly preferred at the 95% confidence level. In other words, the result is within the margin of error.

Now let’s say that the survey result was that candidate A was preferred at 55% and candidate B at 45%. The confidence interval of candidate A’s bell curve would be 52% to 58%. Candidate B’s confidence interval would be 42% to 48%. So based on this sample, the highest the true preference for candidate B could be is 48% and the lowest the true preference for candidate A could be is 52%. There is no overlap here, so this would be significant enough to say that voters as a whole do prefer candidate A. That is, the result is outside the margin of error. And when you see results with that statement, that is much better for candidate A.
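For readers who like to check numbers, here is a minimal Python sketch of where the 0.03 comes from. It assumes the usual normal-approximation formula for a proportion, margin = z × √(p(1 – p)/n) with z = 1.96 for 95% confidence, which this post has not derived. The function names are mine and purely illustrative.

```python
import math

def confidence_interval(p_hat, n, z=1.96):
    """Return (low, high, margin) for a sample proportion.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin

def intervals_overlap(p_a, p_b, n):
    """True if the two candidates' 95% intervals overlap."""
    low_a, high_a, _ = confidence_interval(p_a, n)
    low_b, high_b, _ = confidence_interval(p_b, n)
    return low_a <= high_b and low_b <= high_a

low, high, margin = confidence_interval(0.52, 1000)
print(f"margin = {margin:.3f}, interval = {low:.2f} to {high:.2f}")
print(intervals_overlap(0.52, 0.48, 1000))  # the 52% vs 48% survey
print(intervals_overlap(0.55, 0.45, 1000))  # the 55% vs 45% survey
```

With 1000 people surveyed the margin comes out at roughly 0.03, and only the 52% versus 48% survey produces overlapping intervals.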

Statistics – Combinations (Selections)

In my last post, I described how you can find all the ways to arrange x things from a group of n things. Here, order matters, and the equation to calculate this is

\[P\left(n,x\right)=\frac{n!}{\left(n-x\right)!}\]

If this looks strange, please read my last post. Now let’s talk about counting things where order does not matter, for example, picking a team of players.

The math term for this is combinations. Let’s introduce this with an example. How many ways can you arrange the letters A, B, and C? From my last post, you know that this is 3! = 6. ABC and CBA are two different arrangements. Now how many ways can you select all 3 letters? Well, there is only one way that can be done. ABC and CBA are the same selection so are only counted once. Notice that for a given n and x, there are fewer selections than arrangements. In this example, there are 3! = 6 times as many arrangements as selections.

Now let’s modify this example. Suppose we want to select 3 letters from ABCDE. For any 3 of the letters chosen, there will be 3! times more arrangements than selections, which means that if we use the permutation formula above to answer this question, the answer would be 3! times too large. Generalising this, there are x! times more arrangements than selections for a given n and x. This allows us to modify the formula above by dividing it by x! to get the combinations formula

\[C\left(n,x\right)=nCx=\left(\begin{matrix}n\\x\\\end{matrix}\right)=\frac{n!}{x!\left(n-x\right)!}\]

The expressions on the left are some of the different notations used, and the right side is the actual formula. As with permutations, you can use a CAS calculator to do this calculation with the nCr function. Selecting 3 letters from 5, there are

\[C\left(5,\ 3\right)=\frac{5!}{3!\left(5-3\right)!}=10\]

ways to do that. This would also answer the questions: how many ways can you select a team of 3 people from 5 people or how many 3-card hands can be dealt from a deck of 5 cards?
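If you would rather see the lists than trust the formula, here is a small Python sketch that enumerates both the arrangements and the selections of 3 letters from ABCDE. It uses the standard library’s itertools module; the variable names are my own.

```python
from itertools import combinations, permutations
from math import factorial

letters = "ABCDE"
x = 3

arrangements = list(permutations(letters, x))  # order matters
selections = list(combinations(letters, x))    # order ignored

print(len(arrangements))  # 60
print(len(selections))    # 10
# Each selection can be ordered in x! ways, hence the factor of 3! = 6.
print(len(arrangements) // len(selections) == factorial(x))  # True
```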

Speaking of cards, how many 5-card poker hands can be dealt from a standard deck of 52 cards?

\[C\left(52,\ 5\right)=\frac{52!}{5!\left(52-5\right)!}=2,598,960\]

Now let’s put this in practice. There are many lotto games around based on picking 6 numbers out of 45. Let’s first calculate how many ways you can select 6 numbers out of 45:

\[C\left(45,\ 6\right)=\frac{45!}{6!\left(45-6\right)!}=8,145,060\]

From my post on basic probability, the probability of your lotto ticket with a single set of 6 numbers winning is

\[\text{probability of winning}=\frac{1}{8145060}=0.000000123=0.0000123\%\]

Now if you buy a block of 50 sets of numbers, how much does that improve your chances of winning? This is a binomial distribution problem which is beyond the scope of this post, but the calculation uses the probability found above and gives a 0.000614% chance that at least one of the sets wins. That’s about a 1 in 162,902 chance of winning with a 50 pick lotto card. In Australia, there is a 1 in 12,000 chance of being hit by lightning. Just make sure you’re not standing outside when you buy your ticket.
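The lotto numbers can be reproduced with a few lines of Python. This is only a sketch: it assumes the 50 picks are treated as independent tries, each with the single-ticket probability, which is how I read the binomial model mentioned above. math.comb needs Python 3.8 or later.

```python
from math import comb

total = comb(45, 6)      # ways to pick 6 numbers from 45
p_single = 1 / total     # chance that one set of 6 numbers wins

# Assumption: 50 independent picks, so "at least one wins" is
# 1 minus the chance that all 50 lose.
p_block = 1 - (1 - p_single) ** 50

print(total)
print(f"{p_single * 100:.7f}%")
print(f"{p_block * 100:.6f}%")
print(round(1 / p_block))
```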

Statistics – Permutations (Arrangements)

I will discuss two ways of counting how a group of items can be picked from a larger set of items. These two types are called permutations (also called arrangements) and combinations (also called selections).

Combinations are when the order of the picking does not matter. For example, when picking 5 cards from a 52 card deck, the order does not matter: Ace, 2, 3, 4, 5 is the same hand as 5, 4, 3, 2, Ace (assuming the suits are the same). Or another example is how many 5 player teams can be made from 30 people. I will discuss combinations in a subsequent post. This post is about permutations, where the order of things picked does matter.

An example of a permutation problem is how many ways you can arrange 5 guests at a table from a group of 50 people. Here, order matters: Adam, Betty, Charlie, David, Eddie arranged in that order is different from Eddie, David, Charlie, Betty, Adam.

Let’s look at a simple example and extrapolate from that.

From 5 people, how many ways can we seat 3 of them? There are 5 ways to pick the first person. Now there are only 4 people left, so there are 4 ways to pick the next person. Now there are 3 people left so we only have 3 ways to pick the last person. So the number of ways is the 5 ways to pick the first times the 4 ways to pick the second times the 3 ways to pick the last person: 5 × 4 × 3 = 60 ways to arrange 3 people from a group of 5. If you follow this pattern, you can arrange 5 people from a group of 10 in 10 × 9 × 8 × 7 × 6 = 30,240 ways.

This can be generalised: how many ways can you arrange x things from n things. Before I show the formula for this, I need to explain new notation.

Using a “!” after a number has a meaning in maths. This is called a factorial. As an example, 5! = 5 × 4 × 3 × 2 × 1 = 120. So a factorial is the product of a number and every whole number below it, down to 1. Factorials increase quickly. 40! is a number slightly greater than 8 followed by 47 zeroes. Factorials are used in maths formulas frequently and in order to make these consistent, 0! is defined as 1. It doesn’t look right, but it must be defined this way, as we will see.

Looking at the examples above, we have partial factorials: instead of 5 × 4 × 3 × 2 × 1, we have 5 × 4 × 3, or instead of 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1 we have 10 × 9 × 8 × 7 × 6. Notice that 5! can be thought of as 5 × 4 × 3 × 2! and 10! can be 10 × 9 × 8 × 7 × 6 × 5!. In the first example, the “2” is 5 – 3, that is the number of people minus the number in the arrangement. In the second, the “5” is 10 – 5, that is the number of people minus the number in the arrangement. If we let n be the number of total things and x the number of the things to be arranged, then the formula to compute this in general is:

\[P\left(n,x\right)={^n}P_x=P_x^n=nPx=\frac{n!}{\left(n-x\right)!}\]

The formula is the far-right expression; the notations on the left are common notations used in different places that mean the same thing. So applying this to our two examples:

\[P(5,3)=\frac{5!}{\left(5-3\right)!}=\frac{5\times4\times3\times2!}{2!}=5\times4\times3=60\] \[P\left(10,5\right)=\frac{10!}{\left(10-5\right)!}=\frac{10\times9\times8\times7\times6\times5!}{5!}=10\times9\times8\times7\times6=30,240\]

The 2! and the 5! cancel out in the fractions and we get the result we want. If we wanted to arrange 5 things from a group of 5, we use the definition that 0! = 1:

\[P\left(5,5\right)=\frac{5!}{\left(5-5\right)!}=\frac{5!}{0!}=\frac{5!}{1}=5\times4\times3\times2\times1=120\]

Because of how large factorials grow, if calculating this formula by hand, it is better to first cancel the (n – x)! part from the numerator, then calculate the result.

If you are fortunate to own a CAS calculator, using the permutation function nPr gets the same result with less work: nPr(10,5) = 30,240.
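If you don’t have a CAS calculator handy, Python can do the same job. The sketch below implements the formula from this post directly and compares it with math.perm, the standard library’s equivalent of nPr (Python 3.8 or later). The helper name arrangements is my own.

```python
from math import factorial, perm

def arrangements(n, x):
    """P(n, x) = n! / (n - x)!, the formula from this post."""
    return factorial(n) // factorial(n - x)

for n, x in [(5, 3), (10, 5), (5, 5)]:
    # The two columns should agree for every pair.
    print(n, x, arrangements(n, x), perm(n, x))
```

The (5, 5) case only works because Python, like the maths, takes 0! to be 1.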

Now this does not directly answer questions about picks where order does not matter, like the number of poker hands. That is a combination question and I will talk about that in my next post.

Statistics – Probability of Conditional Events

This is about the probability of an event given some information. What follows assumes you know how to calculate basic probabilities (two posts ago) and the probability of an intersection of events (my last post).

Let’s start with an example. What is the probability of rolling a 6 on the roll of a die? From basic probability, we know that it is the number of ways to roll a 6 (only 1 way) divided by the number of total things that can happen (6). So the probability is 1/6. Now what if the die is rolled and a friend cheats by telling you that the number rolled is odd? Intuitively, you would say that the probability is 0 as 6 is an even number, so the additional information tells you that a 6 is not possible. The probability of a 6 and an odd number is 0 because the number of ways you can roll a 6 and an odd number is 0.

Now if the die is rolled and your friend says that the number rolled is even, what is the probability that a 6 was rolled? Intuitively, knowing that the number is even should increase the chances that a 6 was rolled. We can answer this using the basic probability formula: the number of ways to roll a 6 and an even number divided by the number of even numbers. Knowing that the number is even reduces the number of total things that can happen from 6 to 3. And the number of ways you can roll a 6 and an even number is 1. So the new probability, thanks to your friend, is 1/3.

As always, maths has notation for this. Let A and B be two events. Then the notation for “the probability of event A given (or on condition that) B occurred” is P(A|B). In the examples above, A is the event of rolling a 6, and B is the event of rolling an odd number (first example) or an even number (second example). The equation to calculate this is

\[P\left(A\middle|B\right)=\frac{n\left(A\cap B\right)}{n\left(B\right)}\]

From my last two posts, remember that n(something) means the number of ways that something can occur, and the symbol ∩ means intersection or “and”.

This equation can be shown to be equivalent to

\[P\left(A\middle|B\right)=\frac{P\left(A\cap B\right)}{P\left(B\right)}\]

where the probabilities are used instead of the numbers. This can be rearranged to give what is called the multiplication rule of probability

\[P\left(A\cap B\right)=P\left(A\middle|B\right)P\left(B\right)\]

So if P(A ∩ B) = 0.3 and P(B) = 0.7, then

\[P\left(A\middle|B\right)=\frac{P\left(A\cap B\right)}{P\left(B\right)}=\frac{0.3}{0.7}=\frac{3}{7}\]
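The die examples can be checked with Python sets, where the & operator plays the role of ∩. This is a minimal sketch that assumes all six faces are equally likely; the function name conditional is mine.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def conditional(a, b):
    """P(A|B) = n(A ∩ B) / n(B) for equally likely outcomes."""
    return Fraction(len(a & b), len(b))

six = {6}
odd = {n for n in sample_space if n % 2 == 1}
even = {n for n in sample_space if n % 2 == 0}

print(conditional(six, odd))   # 0
print(conditional(six, even))  # 1/3

# The probability form gives the same kind of answer:
# P(A ∩ B) = 0.3 and P(B) = 0.7 give P(A|B) = 3/7.
print(Fraction(3, 10) / Fraction(7, 10))  # 3/7
```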

Another example that shows conditional probability and the multiplication rule of probability in action is the following:

There is a bag with 10 marbles in it: 4 red and 6 blue. Two marbles are picked from the bag without replacing the first marble picked. “Without replacing” is important because the probability of the second marble’s color is affected by the first marble picked. If the first marble were replaced, the probability of the second marble’s color would not depend on the first marble’s color, that is, the two picks would be independent of each other.

So let’s look at some of the probabilities in this experiment. The probability that the first marble is red is P(R₁) = 4/10 = 2/5. The probability for the second marble depends on that result, as there is now 1 less red marble and 1 less marble in total. So the probability that the second marble is red, given that the first was red, is P(R₂|R₁) = 3/9 = 1/3 because there are only 3 red marbles left out of the 9 marbles left. Likewise, P(B₂|R₁) = 6/9 = 2/3. For simple experiments like this, tree diagrams are often used to get a complete picture of all the possibilities:

The last column of combination probabilities uses the multiplication rule previously stated. Using tree diagrams like this, you can answer many questions about the experiment by adding these probabilities:

1. What is the probability of picking just one red marble?

\[P\left(B_2\cap R_1\right)+P\left(R_2\cap B_1\right)=\frac{4}{15}+\frac{4}{15}=\frac{8}{15}\]

2. What is the probability of picking two marbles of the same color?

\[P\left({R_2\cap R}_1\right)+\ P\left({B_2\cap B}_1\right)=\frac{2}{15}+\frac{1}{3}=\frac{7}{15}\]

3. What is the probability of picking at least one red marble?

\[P\left({R_2\cap R}_1\right)+P\left({B_2\cap R}_1\right)+P\left({R_2\cap B}_1\right)=\frac{2}{15}+\frac{4}{15}+\frac{4}{15}=\frac{10}{15}=\frac{2}{3}\]

Note that since the last column includes all the possible ways this experiment can go, all of these probabilities add up to 1. So to answer question 3, a more efficient way to calculate the answer is to subtract the one possibility excluded from 1:

\[1-P\left({B_2\cap B}_1\right)=1-\frac{1}{3}=\frac{2}{3}\]
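As a cross-check on the three answers, here is a brute-force Python sketch that lists every ordered pair of marbles that could be drawn and counts the ones matching each question. It assumes each of the 10 × 9 = 90 ordered draws is equally likely; the helper prob and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

bag = ["R"] * 4 + ["B"] * 6
# Every ordered pair of two different marbles (no replacement).
draws = list(permutations(range(len(bag)), 2))

def prob(event):
    """Fraction of draws whose (first, second) colors satisfy event."""
    hits = sum(1 for i, j in draws if event(bag[i], bag[j]))
    return Fraction(hits, len(draws))

one_red = prob(lambda first, second: (first == "R") != (second == "R"))
same_color = prob(lambda first, second: first == second)
at_least_one_red = prob(lambda first, second: "R" in (first, second))

print(one_red)           # 8/15
print(same_color)        # 7/15
print(at_least_one_red)  # 2/3
```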

Counting the number of ways an event can happen in simple experiments like this is easy to do in your head. But what about questions like “how many poker hands (5 cards) can be made from a standard deck of 52 cards?”. Not so easy. So next time, I will talk about how we can “count” large possibilities like this.

Statistics – Probability of Combined Events

I ended my last post showing the probability of picking a type of card from a standard deck of 52 cards. For example, if the event of interest, A, is picking a Jack, then the probability of picking a Jack from a shuffled deck of cards is

\[P\left(A\right)=\frac{4}{52}=\frac{1}{13}\]

because there are 4 ways to pick a Jack out of 52 cards. Now let’s consider probabilities of events like “picking a Jack or a Heart” or “a face card and a Heart”.

If we let event A be picking a Jack, B be picking a Heart, and C be picking a face card (Jack, Queen, or King), then the maths notation for these statements is

\[P\left(A\cup B\right)=\mathrm{probability\ of\ picking\ a\ Jack\ or\ a\ Heart}\] \[P\left(B\cap C\right)=\mathrm{probability\ of\ picking\ a\ face\ card\ and\ a\ Heart}\]

The symbol “∪” stands for the union of two events, but in English, you can use the word “or”: A ∪ B = “A union B” or “A or B”. The symbol “∩” stands for the intersection of two events, but in English, you can use the word “and”: B ∩ C = “B intersection C” or “B and C”. These concepts are easily seen in a Venn diagram:

Circle A is the set of all Jacks and circle B is the set of all Hearts. Now the probability of picking a card from set A is 4/52. The probability of picking a card from set B is 13/52. You may be tempted to say that the probability of A or B is the sum of the two individual probabilities. But both of these probabilities include the Jack of Hearts, so it is counted twice. We have to subtract out this intersection of the two events, so in maths notation:

\[P\left(A\cup B\right)=P\left(A\right)+P\left(B\right)-P\left(A\cap B\right)\]

This equation can be rearranged to show that the probability of the intersection of the two events is equal to the sum of the individual probabilities minus the probability of the union:

\[P\left(A\cap B\right)=P\left(A\right)+P\left(B\right)-P\left(A\cup B\right)\]

These two equations are different forms of what is called the addition rule of probability.

So P(A ∪ B) = 4/52 + 13/52 – 1/52 = 16/52, because P(A ∩ B) is the probability of a Jack and a Heart. Only one card satisfies this, the Jack of Hearts, so the probability of that is 1/52.

Now let’s define event D as picking a Diamond and consider the probability of picking a Heart and a Diamond, P(B ∩ D). This is clearly 0 as a card cannot be both suits. The associated Venn diagram looks like:

Events like this are called mutually exclusive, that is, you can pick one or the other, the picked card cannot be both. For mutually exclusive events:

\[P\left(B\cup D\right)=P\left(B\right)+P\left(D\right)\ \mathrm{and}\ P\left(B\cap D\right) =0\]
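Python’s sets have union (|) and intersection (&) operators, so the addition rule can be checked on an actual deck. The sketch below builds the 52 cards and assumes each is equally likely to be picked; the helper p and the way cards are represented as (rank, suit) pairs are my own choices.

```python
from fractions import Fraction

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["Hearts", "Clubs", "Diamonds", "Spades"]
deck = {(rank, suit) for rank in ranks for suit in suits}

def p(event):
    """Probability of picking a card that belongs to the event set."""
    return Fraction(len(event), len(deck))

jacks = {card for card in deck if card[0] == "J"}
hearts = {card for card in deck if card[1] == "Hearts"}
diamonds = {card for card in deck if card[1] == "Diamonds"}

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
print(p(jacks | hearts) == p(jacks) + p(hearts) - p(jacks & hearts))  # True

# Mutually exclusive events: the intersection is empty.
print(p(hearts & diamonds))  # 0
print(p(hearts | diamonds) == p(hearts) + p(diamonds))  # True
```

Note that Fraction reduces 16/52 to 4/13 automatically, so that is what p(jacks | hearts) displays.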

In my next post, I will discuss what are called conditional probabilities and explore the probability of picking a Jack given that the card is a Heart.

Statistics – Basic Probability

As I am covering this topic now with many of my students, let’s start a series on statistics.

The first concept taught when introducing statistics to students is that of probability. Let’s start with the experiment of rolling a die. I italicise experiment because it is a formal term in statistics. I will italicise other terms in this post.

If we are interested in the outcome or the event of rolling a “3”, what is the probability of that occurring? As there are six possible outcomes, all equally likely, and “3” is just one of them, then the probability is 1/6 or 1 out of 6. As maths likes to use shorthand notation to represent concepts, let’s notationise (my word) this.

Let A represent the event of rolling a “3”. The probability of this is represented by P(A). The probability of rolling a “3” is the number of ways a “3” can occur (one way) divided by the total number of things that can occur (six). So to generalise this, for experiments where all outcomes are equally likely, the probability of an event A is

\[P\left(A\right)=\frac{\mathrm{number\ of\ ways}\ A\ \mathrm{can\ occur}\ }{\mathrm{total\ number\ of\ things\ that\ can\ occur}}=\frac{n\left(A\right)}{n\left(\xi\right)}\]

Now I’ve introduced some new notation here. n(A) is notation that means “number of ways A can occur”. The Greek letter xi, ξ, is the set of all the things that can happen. In this case, ξ = {1, 2, 3, 4, 5, 6}. This is also called the sample space of the experiment. So n(ξ) = 6. In our experiment and the event of interest (rolling a “3”), n(A) = 1 and n(ξ) = 6 so P(A) = 1/6.

So what is the probability of rolling an even number? Here, A = “rolling an even number”. As there are three even numbers, or three ways, that this can occur, then P(A) = 3/6 = 1/2.

Let me say a few general things about probability. The probability of an event is always a number between 0 and 1 including 0 and 1. At the extreme ends, if P(A) = 0, then event A has no chance of occurring. So in our experiment, if A is the event of rolling a “7”, then P(A) = 0. If P(A) = 1, then the event is a certainty to happen. If A is the event of rolling an odd or even number, then P(A) = 1.
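Here is a minimal Python sketch of the formula P(A) = n(A)/n(ξ) for the die experiment, covering the examples above including the two extreme cases. It assumes a fair die; the function name probability is mine.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # the set ξ for one roll of a die

def probability(event):
    """P(A) = n(A) / n(ξ); outcomes outside ξ cannot occur."""
    return Fraction(len(event & sample_space), len(sample_space))

print(probability({3}))           # 1/6
print(probability({2, 4, 6}))     # 1/2
print(probability({7}))           # 0, a "7" cannot be rolled
print(probability(sample_space))  # 1, an odd or even number is certain
```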

Now let’s look at a slightly more complex experiment: picking a card from a standard deck of cards. A standard deck of 52 cards has four suits (hearts, clubs, diamonds, spades) of 13 cards each. Each suit consists of an Ace, Jack, Queen, King, and numbered cards 2 to 10. The Jack, Queen, and King cards are called face cards. So this experiment is choosing one card out of a shuffled deck of cards.

If A = choosing a Heart, then P(A) = 13/52 = 1/4 because there are 13 ways to pick a Heart out of 52 ways any card can be picked. Now let A = choosing a Jack. Here P(A) = 4/52 = 1/13 as there are 4 Jacks in a deck of cards. Now let A = picking a face card. Then P(A) = 12/52 = 3/13 as there are 12 face cards in a deck.

In my next post, we’ll explore how to handle more complex events like choosing a Queen or a Heart.

Learning to Count Again

Let’s define an event A, flipping heads on the flip of a coin for example. What is the probability of that event occurring? I will use the common notation of P(A) to represent the probability of an event A. The basic way to calculate a probability is to divide the number of ways A can occur by the number of ways anything can occur. That is,

\[\text{P(A)} =\frac{\text{Number of ways A can occur}}{\text{Number of ways anything can occur}}\]

In the case of getting a heads from a flip of a coin, there is only one way to get a heads and there are two possibilities. So the probability is 1/2.

If we flip two coins and we now let our event A be two heads, then the counting of the ways anything can happen takes just a little more thought. The possibilities are: HH, HT, TH, and TT. So there is only one way to get two heads but there are four possibilities. So the probability is 1/4. If the event is “at least one head”, there are three ways that can happen out of the total of four possibilities, so the probability is 3/4.
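For small experiments like this, a computer can do the listing for you. The Python sketch below generates the four outcomes for two coins and counts the ones matching each event. It assumes fair coins, so every outcome is equally likely; the helper name probability is my own.

```python
from fractions import Fraction
from itertools import product

# All outcomes for two flips: HH, HT, TH, TT
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

def probability(event):
    """Number of outcomes where event is true over all outcomes."""
    hits = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(hits, len(outcomes))

two_heads = probability(lambda outcome: outcome == "HH")
at_least_one_head = probability(lambda outcome: "H" in outcome)

print(outcomes)
print(two_heads)          # 1/4
print(at_least_one_head)  # 3/4
```

Changing repeat=2 to repeat=3 handles three coins with no other changes.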

We can increase the complexity of our experiment and flip three coins or ask questions about choosing certain cards in a standard deck. The counting for the numerator and the denominator gets harder but it is still possible with a little more thought. But what if I asked “what is the probability of getting a flush (all cards of the same suit) in 5 cards randomly selected from a deck of cards?” or “A four-digit number (with no repetitions) is to be formed from the set of digits {1, 2, 3, 4, 5, 6}. Find the probability that the number is even.” Now the counting gets much harder. But fortunately, there are ways to handle this.

Selection or Arrangement

Let’s say we have 8 people standing around just waiting for a math problem to show up. Someone wants to form a team of 5 people from these 8 and another person wants to arrange 5 of these people around a table. How many different teams can be made and how many different seatings around the table are possible?

Notice that in the team question, it doesn’t matter what order you pick the players in, but in the table question, order does matter. Just using letters to represent the people, ABCDE and EDCBA are the same for the team question so would only be counted once, but these are two separate arrangements in the table question. The table question is an arrangement question whereas the team question is a selection question. In textbooks, arrangements are often called permutations and selections are called combinations. Whatever they are called, you need to know which type you have in order to count the number of possibilities correctly.

Arrangements

Let’s look at the table question first. You have 8 ways to choose the first person, but when you do, there are only 7 people left to choose as the second person. Then you only have 6 left for the third position, 5 left for the fourth and 4 left for the fifth. So the total number of arrangements is 8×7×6×5×4 = 6720. Multiplying numbers that sequentially decrease by one is a common thing when doing these kinds of problems so, as is so common in maths, a shorthand notation was created. 8×7×6×5×4×3×2×1 is represented as 8! and is called “8 factorial”. But for our arrangement problem, we are missing the 3×2×1 part. Notice that 3×2×1 = 3!. So another way to show the solution is 8!/3!. Also notice that 8 – 5 = 3.

We can generalise this. If you have n things and want to arrange them r at a time, then the number of arrangements is n!/(n – r)!. Once again, even this is too much to write for our lazy mathematicians, so this is given a shortcut notation where the P stands for permutation:

\[
^{n} P_{r} =\frac{n!}{( n-r) !}
\]

In a CAS calculator, the function “nPr” is used to calculate permutations, so for our problem nPr(8,5) = 6720.

By the way, what if we wanted to arrange all 8 people? Then using the permutation formula, we get 8!/(8-8)! = 8!/0!. This looks illegal but mathematicians foresaw this and defined 0! as being equal to 1. Must be nice to be able to make your own rules.

Selections

Now let’s look at the team question. Using letters again, we are now in the scenario where ABCDE and EDCBA are the same team and this should only be counted once. So we would expect that the number of selections using the same n and r would be smaller than the number of arrangements. In fact this is true. So for our team selection, if we use our arrangement formula, each team has 5! different arrangements and the calculated number is 5! times too large. So if we divide our arrangement total of 6720 by 5! = 120, we would get the correct number for the number of teams possible, 6720/120 = 56. To generalise, if you have n things and want to select them r at a time without regard to order, then the number of selections is n!/(r!(n – r)!). Again, there is shorthand for this where the C stands for combination:

\[^{n} C_{r} =\frac{^{n} P_{r}}{r!} =\frac{n!}{r!( n-r) !}\]

In a CAS calculator, the function “nCr” is used to calculate combinations, so for our problem nCr(8,5) = 56.

Poker

So at the beginning of this post, I posed a hypothetical question about the probability of being dealt a flush hand of 5 cards out of a deck of 52 cards. As card order does not matter, this is a selection (combination) problem. We need to find the number of ways to get 5 card flush hands and the total number of possible 5 card hands. I assume you are familiar with the standard deck of cards that consists of 4 suits of 13 cards each.

First let’s find the total number of 5 card hands there are. Here n = 52 and r = 5:

\[^{52} C_{5} =\frac{52!}{5!( 52-5) !} =2,598,960\]

This will be our denominator to use in the probability formula. For the numerator, we need to find how many ways you can get a 5 card flush. You could get 5 clubs, 5 hearts, 5 diamonds, or 5 spades. Each one of these is a selection of 13 cards, 5 at a time or:

\[^{13} C_{5} =\frac{13!}{5!( 13-5) !} =1,287\]

We need to multiply this by 4 since there are 4 suits, 1287×4 = 5148. So the probability of being dealt a flush in 5 card poker is P(Flush) = 5148/2598960 = 0.00198 = 0.198% or about once in 505 hands. I wouldn’t bet on it.
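All the counts in this post can be reproduced in Python, whose math.perm and math.comb functions (Python 3.8 or later) do the same job as nPr and nCr on a CAS calculator. This is just a sketch for checking the arithmetic; the variable names are mine.

```python
from math import comb, perm

print(perm(8, 5))  # 6720 seatings of 5 people chosen from 8
print(comb(8, 5))  # 56 teams of 5 people chosen from 8

total_hands = comb(52, 5)      # all possible 5 card hands
flush_hands = 4 * comb(13, 5)  # 5 cards from one suit, times 4 suits
p_flush = flush_hands / total_hands

print(total_hands)
print(flush_hands)
print(f"{p_flush:.5f}")
print(f"about once in {round(1 / p_flush)} hands")
```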