Marilyn vos Savant writes a weekly column called “Ask Marilyn” for Parade magazine. In her column of August 18, 2013, she proposed the following problem. Five cats are in a sack: two are tabbies and three are calicos. You let one cat out of the bag, but it runs up a tree before you get a chance to see its color. Then you purposely let out another cat and observe it to be a tabby. What is the probability that the cat in the tree is also a tabby? Marilyn’s answer is 1 chance in 4.

Many readers wrote to Marilyn, arguing that the correct answer should be 2 out of 5 and that observing the one tabby does not change the probability. But this is incorrect; the probability in Marilyn’s question is a simple example of conditional probability, as follows. Initially, it is a given (known fact, probability = 1) that the proportion of tabbies in the bag is 2/5 (a ratio of two tabbies to three calicos). Because there were only two tabbies in the bag initially, and the cat you purposely let out of the bag was observed to be a tabby, exactly one of the four remaining cats of unknown color (three in the bag and one in the tree) is a tabby. Each of those four cats is equally likely to be the one in the tree, so the probability that the cat in the tree is a tabby is 1 in 4.
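The 1-in-4 result can be checked by brute-force counting. The sketch below is only an illustration, not part of Marilyn’s column: the cat labels (T1, T2, C1, C2, C3) and variable names are invented here, and it assumes that every ordered pair of (cat in the tree, cat observed) is equally likely.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels: two tabbies (T) and three calicos (C).
cats = ["T1", "T2", "C1", "C2", "C3"]


def is_tabby(cat):
    return cat.startswith("T")


# Every ordered pair (cat in the tree, cat observed), assumed equally likely.
pairs = list(permutations(cats, 2))

# Condition on the observation: the second cat is a tabby.
observed_tabby = [pair for pair in pairs if is_tabby(pair[1])]

# Among those cases, count the ones where the tree cat is also a tabby.
both_tabby = [pair for pair in observed_tabby if is_tabby(pair[0])]

p_tree_tabby = Fraction(len(both_tabby), len(observed_tabby))
print(p_tree_tabby)  # 1/4
```

Of the 20 ordered pairs, 8 have a tabby as the observed cat, and in 2 of those the tree cat is the other tabby, which gives 2/8 = 1/4.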

It is important to realize when additional information is relevant or irrelevant to the statistical analysis of model populations. For example, all genetically normal cats with calico color pattern are female. This is additional information, but it is irrelevant to the solution of Marilyn’s problem. But what if we had been told that tabbies are 50% more likely to escape from the bag than calicos? In other words, cats would not leave the bag in random order with respect to color. Technically, the naive answers above, based on fractions, assume that all cats are equally likely to escape. Equally likely outcomes are vital to using simple fractions for probability. If, on the other hand, we knew that tabbies and calicos have different natural probabilities of escaping, we would apply Bayes’s rule to determine probabilities, including the chance that the cat up a tree is a tabby, before a second cat gets out of the bag. Computing afterward the conditional probability that the cat up a tree is a tabby, given that the cat we can observe is a tabby, gets rather complicated mathematically. We explained the rudiments of Bayes’s rule in a previous issue of ABT (Stansfield & Carlton, 2004), and we pointed out in another previous issue (Stansfield & Carlton, 2011) that all assumptions underlying a statistical solution should be identified, along with consideration of alternative explanations, when analyzing model populations in published papers.
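As a rough illustration of how Bayes’s rule would enter, the sketch below works through one possible reading of the unequal-escape scenario. Everything in it is an assumption made for illustration: “50% more likely” is modeled as an escape weight of 3/2 for each tabby versus 1 for each calico, each cat leaves with probability proportional to its weight among the cats still in the bag, and the second cat is treated as escaping by the same rule rather than being purposely released. The function and variable names are hypothetical.

```python
from fractions import Fraction


def tree_cat_probabilities(w_tabby, w_calico=Fraction(1)):
    """Return (prior, posterior) that the tree cat is a tabby.

    Assumed model: cats leave one at a time, each with probability
    proportional to its weight among the cats still in the bag.
    """

    def p_next_is_tabby(n_tabby, n_calico):
        tabby_weight = n_tabby * w_tabby
        return tabby_weight / (tabby_weight + n_calico * w_calico)

    # Prior: the first cat out (now in the tree) is a tabby.
    prior_tabby = p_next_is_tabby(2, 3)
    prior_calico = 1 - prior_tabby

    # Likelihood that the second cat is a tabby, given the first cat's color.
    like_if_tree_tabby = p_next_is_tabby(1, 3)
    like_if_tree_calico = p_next_is_tabby(2, 2)

    # Bayes's rule: P(tree tabby | second cat observed to be a tabby).
    evidence = prior_tabby * like_if_tree_tabby + prior_calico * like_if_tree_calico
    posterior_tabby = prior_tabby * like_if_tree_tabby / evidence
    return prior_tabby, posterior_tabby


# Equal weights reproduce the fraction-based answers: prior 2/5, posterior 1/4.
print(tree_cat_probabilities(Fraction(1)))

# Tabbies weighted 3/2 (assumed meaning of "50% more likely to escape").
print(tree_cat_probabilities(Fraction(3, 2)))
```

Under these assumptions the chance that the tree cat is a tabby is 1/2 before the second cat gets out and 5/14 afterward, compared with 2/5 and 1/4 when all cats are equally likely to escape. A different model of the escape process would give different numbers, which is why the underlying assumptions need to be stated.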

Parade magazine is distributed in more than 640 newspapers in the United States. It is the most widely read magazine in the country, with a circulation of 32.5 million and a readership of nearly 60 million. We receive our copy as an addition to the Sunday issue of our local newspaper. We believe that Marilyn’s cat problem has been read by far more people than could be reached via any scientific journal or textbook. So it is no longer a trivial example of biostatistics in the popular press. We pointed this out in a previous paper (Stansfield & Carlton, 2009) dealing with the sex distribution in two-child families from a federal survey. If Marilyn’s cat problem is available nationwide to so many people, surely it is worth discussion in high school biology classes.

References
Stansfield, W.D. & Carlton, M.A. (2004). Bayesian statistics for biological data: pedigree analysis. American Biology Teacher, 66, 177–182.
Stansfield, W.D. & Carlton, M.A. (2009). The most widely publicized gender problem in human genetics. Human Biology, 81, 3–11.
Stansfield, W.D. & Carlton, M.A. (2011). The truth about models: how well do mechanical models mimic the observed distributions in two-child families? American Biology Teacher, 73, 213–216.