Friday, October 21, 2016

Why is my cat orange?

One of the students in my Bayesian statistics class, Mafalda Borges, came up with an excellent new Bayes theorem problem.  Here's my paraphrase:
About 3/4 of orange cats are male.  If my cat is orange, what is the probability that his mother was orange?
To answer this question, you have to know a little about the genes that affect coat color in cats:

  • The sex-linked red gene, O, determines whether there will be red variations to fur color. This gene is located on the X chromosome...
  • Males have only one X chromosome, so only have one allele of this gene. O results in orange variations, and o results in non-orange fur.
  • Since females have two X chromosomes, they have two alleles of this gene. OO results in orange-toned fur, oo results in non-orange fur, and Oo results in a tortoiseshell cat, in which some parts of the fur are orange variants and other areas are non-orange.

If the population genetics for the red gene are in equilibrium, we can use the Hardy-Weinberg principle.  If the prevalence of the red allele is p and the prevalence of the non-red allele is q=1-p:

1)  The fraction of male cats that are orange is p and the fraction that are non-orange is q.

2) The fractions of female cats that are OO, Oo, and oo are p², 2pq, and q², respectively.
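As a quick check on these two rules, here is a minimal Python sketch. The function name hardy_weinberg and the value p = 1/4 are illustrative choices, not part of the problem; the real value of p is worked out in the solution.

```python
from fractions import Fraction

def hardy_weinberg(p):
    """Return genotype fractions for male and female cats,
    given the prevalence p of the red allele O."""
    q = 1 - p
    males = {"O": p, "o": q}                          # one X chromosome
    females = {"OO": p**2, "Oo": 2*p*q, "oo": q**2}   # two X chromosomes
    return males, females

# p = 1/4 is an arbitrary value chosen only for illustration
males, females = hardy_weinberg(Fraction(1, 4))

# Each distribution should add up to 1
assert sum(males.values()) == 1
assert sum(females.values()) == 1
```

Using Fraction rather than floats keeps the arithmetic exact, which makes it easy to compare against hand calculations.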

Finally, if we know the genetics of a mating pair, we can compute the probability of each genetic combination in their offspring.

1) If the offspring is male, he got a Y chromosome from his father.  Whether he is orange or not depends on which allele he got from his mother:

   Mother   P(son is O, orange)   P(son is o, non-orange)
   OO       1                     0
   Oo       1/2                   1/2
   oo       0                     1

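2) If the offspring is female, her coat depends on both parents:

   Father   Mother   P(OO, orange)   P(Oo, tortoiseshell)   P(oo, non-orange)
   O        OO       1               0                      0
   O        Oo       1/2             1/2                    0
   O        oo       0               1                      0
   o        OO       0               1                      0
   o        Oo       0               1/2                    1/2
   o        oo       0               0                      1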
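The two inheritance rules above can be sketched in Python. This is only an illustration under simple Mendelian assumptions (each of the mother's two alleles is passed on with probability 1/2); the function names son_genotypes and daughter_genotypes are made up for this example.

```python
from collections import Counter
from fractions import Fraction

def son_genotypes(mother):
    """A son gets Y from his father and one of his mother's
    two X alleles, each with probability 1/2."""
    dist = Counter()
    for allele in mother:            # mother is "OO", "Oo", or "oo"
        dist[allele] += Fraction(1, 2)
    return dict(dist)

def daughter_genotypes(mother, father):
    """A daughter gets her father's only X allele plus one of her
    mother's two X alleles, each with probability 1/2."""
    dist = Counter()
    for allele in mother:
        # Sort so the heterozygous case is always written "Oo"
        genotype = "".join(sorted(allele + father))
        dist[genotype] += Fraction(1, 2)
    return dict(dist)

print(son_genotypes("Oo"))             # half O, half o
print(daughter_genotypes("Oo", "O"))   # half OO, half Oo
```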

That's all the background information you need to solve the problem.  I'll post the solution next week.

SOLUTION:

The first step is to use the background information (3/4 of orange cats are male) to find p.  The fraction of male cats who are orange is p, and the fraction of female cats who are orange is p².  Assuming there are equal numbers of male and female cats, the ratio of orange males to orange females is p to p², which must be 3:1.  Solving p/p² = 3 yields p = 1/3.

Now we are ready to do the Bayesian update.  We'll use three hypotheses:

1) My cat's mother is orange (OO).
2) His mother is not orange, but heterozygous (Oo).
3) His mother is not orange and homozygous (oo).

For each hypothesis, we can compute the likelihood of the data (my orange cat).

1) If his mother is OO, the likelihood he is orange is 1.
2) If his mother is Oo, the likelihood he is orange is 1/2.
3) If his mother is oo, the likelihood he is orange is 0.

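Plugging all this into the "Bayesian update worksheet" looks like this:

   Hypothesis   Prior   Likelihood   Prior x Likelihood   Posterior
   OO           1/9     1            1/9                  1/3
   Oo           4/9     1/2          2/9                  2/3
   oo           4/9     0            0                    0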

So the posterior probability that his mother is orange is 1/3.
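The same update can be carried out in a few lines of Python. This is a sketch rather than the worksheet from class: the priors come from Hardy-Weinberg with p = 1/3, the likelihoods are the three listed above, and the variable names are my own.

```python
from fractions import Fraction

p = Fraction(1, 3)   # prevalence of the red allele, from the first step
q = 1 - p

# Prior: genotype of the mother, from Hardy-Weinberg
priors = {"OO": p**2, "Oo": 2*p*q, "oo": q**2}

# Likelihood: probability that a son is orange, given the mother's genotype
likelihoods = {"OO": Fraction(1), "Oo": Fraction(1, 2), "oo": Fraction(0)}

# Multiply, then normalize so the posteriors add up to 1
unnorm = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnorm.values())
posteriors = {h: unnorm[h] / total for h in unnorm}

for h in priors:
    print(h, priors[h], likelihoods[h], unnorm[h], posteriors[h])
# OO 1/9 1 1/9 1/3
# Oo 4/9 1/2 2/9 2/3
# oo 4/9 0 0 0
```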

As an aside, you could do the update in your head using Bayes's Rule.  After eliminating the third hypothesis (its likelihood is 0), the prior odds of OO to Oo are 1:4.  The likelihood ratio is 2:1, so the posterior odds are 2:4, or 1:2, which corresponds to probability 1/3.  As an exercise for the reader:  what if we learn that one of my cat's littermates is an orange female; now what is the probability that their mother is orange?
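For anyone who wants to check the odds-form calculation for my orange cat, here is a minimal sketch. It only covers the original problem, not the littermate exercise, and the variable names are illustrative.

```python
from fractions import Fraction

# Prior odds of OO versus Oo (the oo hypothesis is ruled out by the data)
prior_odds = Fraction(1, 9) / Fraction(4, 9)        # 1:4

# Likelihood ratio: P(orange son | OO) / P(orange son | Oo)
likelihood_ratio = Fraction(1) / Fraction(1, 2)     # 2

# Bayes's Rule in odds form
posterior_odds = prior_odds * likelihood_ratio      # 1:2

# Convert odds to probability
posterior_prob = posterior_odds / (1 + posterior_odds)
print(posterior_prob)  # 1/3
```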

Finally, in the interest of full disclosure, I don't really have an orange cat.
