This is a good one because it's so counter-intuitive:
Imagine an urn filled with balls, two-thirds of which are of one color and one-third of which are of another. One individual has drawn 5 balls from the urn and found that 4 are red and 1 is white. Another individual has drawn 20 balls and found that 12 are red and 8 are white. Which of the two individuals should feel more confident that the urn contains two-thirds red balls and one-third white balls, rather than vice-versa? What odds should each individual give?
I know the right answer, but maybe I don't quite get the odds calculation. Can anyone explain?
Eliezer Yudkowsky has a (really, really long, but good) explanation of Bayes' Theorem. About 70% down, there's a paragraph beginning "In front of you is a bookbag" which explains the core of this problem.
The punchline is that all that matters is the difference between how many red and white balls have been drawn. Thus, contrary to what others have been saying, you don't have to do any calculations. (This assumes either of the reasonable conditions (a) that the balls are drawn with replacement, or (b) that the urn contains a lot of balls, so that every draw has essentially the same probabilities; either way, the total number of balls doesn't matter.) Here's the argument:
Recall Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B). (A note on terminology: P(A) is the prior and P(A|B) is the posterior. B is some observation you made, and the terminology reflects your confidence before and after your observation.) This form of the theorem is fine, and @bobince and @Adam Rosenfield correctly applied it. However, using this form directly makes you susceptible to arithmetic errors and it doesn't really convey the heart of Bayes' theorem. Adam mentioned in his post (and I mention above) that all that matters is the difference between how many red and white balls have been drawn, because "everything else cancels out in the equations". How can we see this without doing any calculations?
We can use the concepts of odds ratio and likelihood ratio. What is an odds ratio? Well, instead of thinking about P(A) and P(¬A), we will think about their ratio P(A) : P(¬A). Either is recoverable from the other, but the arithmetic works out nicer with odds ratios because we don't have to normalize. Furthermore, it's easier to "get" Bayes' theorem in its alternate form.
What do I mean we don't have to normalize, and what is the alternate form? Well, let's compute. Bayes' theorem says that the posterior odds are
P(A|B) : P(¬A|B) = (P(B|A) * P(A) / P(B)) : (P(B|¬A) * P(¬A) / P(B)).
The P(B) is a normalizing factor to make the probabilities sum to one; however, we're working with ratios, where 2 : 1 and 4 : 2 odds are the same thing, so the P(B) cancels. We're left with an easy expression which happens to factor:
P(A|B) : P(¬A|B) = (P(B|A) * P(A)) : (P(B|¬A) * P(¬A)) = (P(B|A) : P(B|¬A)) * (P(A) : P(¬A))
We've already heard of the second term there; it's the prior odds ratio. What is P(B|A) : P(B|¬A)? That's called the likelihood ratio. So our final expression is
**posterior odds = likelihood ratio * prior odds.**
How do we apply it in this situation? Well, suppose we have some prior odds x : y for the contents of the urn, with x representing 2/3rds red and y representing 2/3rds white. Suppose we draw a single red ball. The likelihood ratio is P(drew red ball | urn is 2/3rds red) : P(drew red ball | urn is 2/3rds white) = (2/3) : (1/3) = 2 : 1. So the posterior odds are 2x : y; had we drawn a white ball, the posterior odds would be x : 2y by similar reasoning. Now we do this for every ball in sequence; if the draws are independent, then we just multiply all the odds ratios. So we get that if we start with an odds ratio of x : y and draw r red balls and w white balls, we get a final odds ratio of
(x : y) * (2 : 1)^r * (1 : 2)^w = (x * 2^r) : (y * 2^w) = (x : y) * (2^(r-w) : 1).
So we see that all that matters is the difference between r and w. This also lets us easily solve the problem. For the first question ("who should be more confident?"), the prior odds don't matter, as long as they're not 1 : 0 or 0 : 1 and both people have identical priors. Indeed, if their identical prior was x : y, the first person's posterior would be (2^3 * x) : y, while the second person's posterior would be (2^4 * x) : y, so the second person is more sure.
Suppose moreover that the prior odds were uniform, that is 1 : 1. Then the first person's posterior would be 8 : 1, while the second person's would be 16 : 1. We can easily translate these into probabilities of 8/9 and 16/17, confirming the other calculations.
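If you want to check this arithmetic mechanically, here's a small Python sketch of the update rule (the helper names `posterior_odds` and `to_probability` are mine, not standard):

```python
from fractions import Fraction

def posterior_odds(x, y, r, w):
    """Update prior odds x : y for "the urn is 2/3 red" after drawing
    r red and w white balls: each red multiplies the odds by 2 : 1,
    each white by 1 : 2, so only r - w matters."""
    return (x * 2**r, y * 2**w)

def to_probability(odds):
    """Convert odds a : b into the probability a / (a + b)."""
    a, b = odds
    return Fraction(a, a + b)

# First person: 4 red, 1 white from a 1 : 1 prior -> 16 : 2 = 8 : 1
print(to_probability(posterior_odds(1, 1, 4, 1)))   # 8/9
# Second person: 12 red, 8 white -> 4096 : 256 = 16 : 1
print(to_probability(posterior_odds(1, 1, 12, 8)))  # 16/17
```

Using `Fraction` keeps the odds exact, so there's no floating-point noise to second-guess.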
The point here is that if you get the bolded equation above, then this problem is really easy. But as importantly, you can be sure you didn't mess up any arithmetic, because you have to do so little.
So this is a bad programming question, but it is a good test of the bolded equation. Just for practice, let's apply it to two more problems:
I randomly choose one of two coins, a fair coin or a fake, double-headed coin, each with 50% probability. I flip it three times and it comes up heads all three times. What's the probability it's the real coin?
The prior odds are real : fake = 1 : 1, as stated in the problem. The probability that I would have seen three heads with the real coin is 1 / 8, but it's 1 with the fake coin, so the likelihood ratio is 1 : 8. So the posterior odds are prior * likelihood = 1 : 8. Thus the probability it's the real coin is 1 / 9.
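The same odds arithmetic takes just a couple of lines to verify (a sketch, again using exact fractions):

```python
from fractions import Fraction

# Prior odds real : fake = 1 : 1.
prior = (1, 1)
# Likelihood ratio for three heads: P(HHH | real) : P(HHH | fake) = 1/8 : 1.
likelihood = (Fraction(1, 8), 1)
# Posterior odds = likelihood * prior, multiplied componentwise.
posterior = (prior[0] * likelihood[0], prior[1] * likelihood[1])
prob_real = posterior[0] / (posterior[0] + posterior[1])
print(prob_real)  # 1/9
```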
This problem also brings up an important caveat: there is a possibly different likelihood ratio for every possible observation. This is because the likelihood ratio for B is P(B|A) : P(B|¬A), which is not necessarily related to the likelihood ratio for ¬B, which is P(¬B|A) : P(¬B|¬A). Unfortunately, in all the examples above, they've been inverses of each other, but here, they're not.
Indeed, suppose I flip the coin once and get tails. What's the probability it's the real coin? Obviously one. How does Bayes' theorem check out? Well, the likelihood ratio for this observation is the probability of seeing this outcome with the real coin versus the fake coin, which is 1/2 : 0 = 1 : 0. That is, seeing a single tails kills the probability of the coin's being fake, which checks out with our intuition.
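The zero component behaves exactly as you'd hope in code, too (a quick sketch; the `update` helper is my naming):

```python
from fractions import Fraction

def update(prior, likelihood):
    """Componentwise product of odds: posterior = likelihood * prior."""
    return (prior[0] * likelihood[0], prior[1] * likelihood[1])

# Prior odds real : fake = 1 : 1; we observe a single tails.
# Likelihood ratio = P(tails | real) : P(tails | fake) = 1/2 : 0.
posterior = update((1, 1), (Fraction(1, 2), 0))
# The fake coin's odds component is now 0, so the coin is certainly real.
prob_real = posterior[0] / (posterior[0] + posterior[1])
print(prob_real)  # 1
```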
Here's the problem I mentioned from Eliezer's page:
In front of you is a bookbag containing 1,000 poker chips. I started out with two such bookbags, one containing 700 red and 300 blue chips, the other containing 300 red and 700 blue. I flipped a fair coin to determine which bookbag to use, so your prior probability that the bookbag in front of you is the red bookbag is 50%. Now, you sample randomly, with replacement after each chip. In 12 samples, you get 8 reds and 4 blues. What is the probability that this is the predominantly red bag? (You don't need to be exact - a rough estimate is good enough.)
The prior odds are red : blue = 1 : 1. The likelihood ratios are 7 : 3 and 3 : 7, so the posterior odds are (7 : 3)^8 * (3 : 7)^4 = 7^4 : 3^4. At this point we just estimate 7 : 3 as, say, 2 : 1, and get 2^4 : 1 = 16 : 1. Our final answer is even greater, so it's definitely bigger than 95% or so; the right answer is around 96.7%. Compare this with most people's answers, which are in the 70--80% range.
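The exact figure is easy to compute the same way, if you want to check the estimate (a sketch):

```python
from fractions import Fraction

# Prior odds red-bag : blue-bag = 1 : 1; we sample 8 red and 4 blue chips.
# Each red chip contributes a likelihood ratio of 7 : 3, each blue chip 3 : 7.
red_side  = 7**8 * 3**4
blue_side = 3**8 * 7**4
prob_red = Fraction(red_side, red_side + blue_side)
print(prob_red)         # 2401/2482, i.e. the odds reduce to 7^4 : 3^4
print(float(prob_red))  # about 0.967
```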
I hope you agree that these problems become really easy, and intuitive, when viewed in this light.