# When Is This Expression True?

15 Nov 2013, 13:14 UTC

Probability theory can be confusing sometimes. I think this confusion comes from a belief that one has intuitive insight into how it works, when one really doesn't. Counter-intuitive results abound. Worse still are results that seem "obvious" but are only true in some cases.

A good example is the following expression, which I simply wrote down one night because it was "obvious":

``````p(A|C) = p(A|B)p(B|C) + p(A|!B)p(!B|C)
``````

Being a good boy, I decided not to use this expression before I proved it. I failed to prove it. I pushed symbols around for an hour and got nowhere. So I posted about it on Facebook and got no fewer than six mathematicians involved (five of them Cambridge graduates, one of them even a Cambridge PhD!). And we were all fairly stumped!

I promised those guys (Dave C, Dave H, Graham, Cedrick, Willem) I'd write up the conclusions, so here you are.

## Background

A, B and C are boolean-valued random variables. Since they are boolean-valued we can draw them using a Venn diagram. (This turns out not to help very much.)

We can also write a truth table for A, B and C as follows, and label the primal probabilities:

``````A B C p
0 0 0 a
0 0 1 b
0 1 0 c
0 1 1 d
1 0 0 e
1 0 1 f
1 1 0 g
1 1 1 h
``````

So p(A&B&C)=h, p(!A&!B&!C)=a, etc.

And a quick recap of conditional probabilities: by definition,

``````p(A|B) = p(A&B)/p(B)
``````
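As a minimal sketch, the truth table and the definition above can be turned into a few lines of Python. The helper names `joint`, `prob` and `cond` and the example numbers are made up for illustration; they are not part of the original discussion.

```python
from itertools import product

def joint(a, b, c, d, e, f, g, h):
    """Map each (A, B, C) outcome to its probability, in truth-table order."""
    return dict(zip(product((0, 1), repeat=3), (a, b, c, d, e, f, g, h)))

INDEX = {"A": 0, "B": 1, "C": 2}

def prob(table, **fixed):
    """p(event), where the event fixes some variables, e.g. prob(t, A=1, B=0)."""
    return sum(p for outcome, p in table.items()
               if all(outcome[INDEX[name]] == value for name, value in fixed.items()))

def cond(table, event, given):
    """p(event | given) = p(event & given) / p(given); both arguments are dicts."""
    return prob(table, **event, **given) / prob(table, **given)

# Arbitrary illustrative numbers (an assumption, not from the post); they sum to 1.
table = joint(0.10, 0.05, 0.20, 0.15, 0.05, 0.10, 0.05, 0.30)

# p(A|B) = (g+h)/(c+d+g+h) = 0.35/0.70
p_a_given_b = cond(table, {"A": 1}, {"B": 1})
print(p_a_given_b)
```

Passing the event and the condition as small dicts keeps the call close to the written notation: `cond(table, {"A": 1}, {"B": 0})` reads as p(A|!B).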

## Easy examples and counterexamples

In the Venn diagram, if A is enclosed in B, then p(A|!B) = 0, so the expression reduces to

``````p(A|C) = p(A|B)p(B|C)
``````

If B is also enclosed in C then this is just an algebraic truth about the ratios of the areas in the Venn diagram: Area(A)/Area(C) = Area(A)/Area(B) * Area(B)/Area(C).

So A in B in C, equivalently p(A|!B) = p(B|!C) = 0, is an example where the expression definitely works. Unfortunately not much else can be deduced just by looking at the Venn diagram (C in B in A also works, by similar logic).
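Here is a quick numerical sanity check of the nested case, as a sketch only. The cell values are arbitrary illustrative numbers and `both_sides` is a made-up helper name. Enclosing A in B zeroes the cells with A true and B false (e and f); enclosing B in C zeroes the cells with B true and C false (c and g).

```python
def both_sides(a, b, c, d, e, f, g, h):
    """Return (lhs, rhs) of p(A|C) = p(A|B)p(B|C) + p(A|!B)p(!B|C)."""
    p_b = c + d + g + h    # p(B)
    p_nb = a + b + e + f   # p(!B)
    p_c = b + d + f + h    # p(C)
    lhs = (f + h) / p_c
    rhs = (g + h) / p_b * (d + h) / p_c + (e + f) / p_nb * (b + f) / p_c
    return lhs, rhs

# A in B in C: only !A&!B&!C (a), !A&!B&C (b), !A&B&C (d) and A&B&C (h) survive.
lhs, rhs = both_sides(a=0.4, b=0.3, c=0.0, d=0.2, e=0.0, f=0.0, g=0.0, h=0.1)
print(lhs, rhs)  # both are Area(A)/Area(C) = 0.1/0.6, up to rounding
```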

But the expression certainly does not always hold. A counterexample is if B and C are independent, that is, p(B|C) = p(B) and p(!B|C) = p(!B). Then the expression reduces to:

``````p(A|C) = p(A|B)p(B|C) + p(A|!B)p(!B|C)
= p(A|B)p(B) + p(A|!B)p(!B)
= p(A&B) + p(A&!B)
= p(A)
``````

But this states that A and C are independent. Nothing about B and C being independent forces that (A could simply be a copy of C, say), so the expression fails in this case.
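To make the counterexample concrete, here is a small sketch with one hypothetical table: B is a fair coin independent of everything else, C is a fair coin, and A copies C exactly. The helper name `both_sides` and the numbers are assumptions chosen for illustration.

```python
def both_sides(a, b, c, d, e, f, g, h):
    """Return (lhs, rhs) of p(A|C) = p(A|B)p(B|C) + p(A|!B)p(!B|C)."""
    p_b = c + d + g + h    # p(B)
    p_nb = a + b + e + f   # p(!B)
    p_c = b + d + f + h    # p(C)
    lhs = (f + h) / p_c
    rhs = (g + h) / p_b * (d + h) / p_c + (e + f) / p_nb * (b + f) / p_c
    return lhs, rhs

# Only the cells with A == C are non-zero: a (000), c (010), f (101), h (111).
lhs, rhs = both_sides(a=0.25, b=0.0, c=0.25, d=0.0, e=0.0, f=0.25, g=0.0, h=0.25)
print(lhs)  # 1.0, because A is a copy of C
print(rhs)  # 0.5, which is just p(A)
```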

## General constraint

Dave H managed to do some algebraic gymnastics on the full expression, using the primal truth table probabilities. It goes something like this:

``````p(A|C) = (f+h)/(b+d+f+h)
p(A|B) = (g+h)/(c+d+g+h)
p(A|!B) = (e+f)/(a+b+e+f)
p(B|C) = (d+h)/(b+d+f+h)
p(!B|C) = (b+f)/(b+d+f+h)

Expression is therefore:
(f+h)/(b+d+f+h) = (g+h)/(c+d+g+h).(d+h)/(b+d+f+h) + (e+f)/(a+b+e+f).(b+f)/(b+d+f+h)

Cancel b+d+f+h on both sides and raise denominators:
(f+h)(c+d+g+h)(a+b+e+f) = (g+h)(d+h)(a+b+e+f) + (e+f)(b+f)(c+d+g+h)

Combine products of (a+b+e+f):
0 = [(g+h)(d+h) - (f+h)(c+d+g+h)](a+b+e+f) + (e+f)(b+f)(c+d+g+h)
= [gd+gh+hd+hh - fc-fd-fg-fh-hc-hd-hg-hh](a+b+e+f) + (e+f)(b+f)(c+d+g+h)
= [gd-hc - f(c+d+g+h)](a+b+e+f) + (e+f)(b+f)(c+d+g+h)

Combine products of (c+d+g+h):
0 = (gd-hc)(a+b+e+f) + [(e+f)(b+f) - f(a+b+e+f)](c+d+g+h)
= (gd-hc)(a+b+e+f) + [eb+fb+ef+ff - fa-fb-fe-ff](c+d+g+h)
= (gd-hc)(a+b+e+f) + (eb-fa)(c+d+g+h)

Rearrange:
(gd-hc)/(c+d+g+h) = (fa-eb)/(a+b+e+f)
``````
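The algebra above can be checked numerically. Following the same steps without setting the difference to zero gives rhs - lhs = [(gd-hc)/p(B) - (fa-eb)/p(!B)] / p(C), so the original expression holds exactly when the two residuals match. The sketch below tests that identity on random tables; the function names and the fixed random seed are arbitrary choices for illustration.

```python
import random

def expression_gap(a, b, c, d, e, f, g, h):
    """rhs - lhs of p(A|C) = p(A|B)p(B|C) + p(A|!B)p(!B|C)."""
    p_b, p_nb, p_c = c + d + g + h, a + b + e + f, b + d + f + h
    lhs = (f + h) / p_c
    rhs = (g + h) / p_b * (d + h) / p_c + (e + f) / p_nb * (b + f) / p_c
    return rhs - lhs

def residual_gap(a, b, c, d, e, f, g, h):
    """(gd-hc)/p(B) - (fa-eb)/p(!B): zero exactly when the constraint holds."""
    return (g * d - h * c) / (c + d + g + h) - (f * a - e * b) / (a + b + e + f)

rng = random.Random(0)  # fixed seed so the run is repeatable
max_err = 0.0
for _ in range(1000):
    weights = [rng.random() for _ in range(8)]
    total = sum(weights)
    t = [w / total for w in weights]          # a random table summing to 1
    p_c = t[1] + t[3] + t[5] + t[7]           # b + d + f + h
    max_err = max(max_err, abs(expression_gap(*t) - residual_gap(*t) / p_c))

print(max_err)  # should be at floating-point rounding level
```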

Substituting the probabilities back in for the symbols gives:

``````[p(A&B&!C)p(!A&B&C) - p(A&B&C)p(!A&B&!C)] / p(B) = [p(!A&!B&!C)p(A&!B&C) - p(!A&!B&C)p(A&!B&!C)] / p(!B)
``````

Writing each joint probability as a conditional probability times p(B) or p(!B), this is also equivalent to:

``````p(B)[p(A&!C|B)p(!A&C|B) - p(A&C|B)p(!A&!C|B)] = p(!B)[p(!A&!C|!B)p(A&C|!B) - p(!A&C|!B)p(A&!C|!B)]
``````

Note that if you exchange A<->!A, B<->!B and/or C<->!C, the constraint is unchanged: each exchange at most swaps the two sides and/or flips the sign of both. This implies that the original expression holds not only for p(A|C) but also for p(A|!C), p(!A|C) and p(!A|!C) (by exchanging variables with their negations). This is what we would hope/expect.
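The symmetry can be illustrated with a small sketch. Negating a variable just permutes the rows of the truth table, and the difference between the two sides of the constraint either keeps its value or changes sign. The helper names and the example table are assumptions made for illustration.

```python
def residual_gap(a, b, c, d, e, f, g, h):
    """Left side minus right side of (gd-hc)/p(B) = (fa-eb)/p(!B)."""
    return (g * d - h * c) / (c + d + g + h) - (f * a - e * b) / (a + b + e + f)

def negate_a(a, b, c, d, e, f, g, h):
    return (e, f, g, h, a, b, c, d)   # swap the A=0 rows with the A=1 rows

def negate_b(a, b, c, d, e, f, g, h):
    return (c, d, a, b, g, h, e, f)   # swap the B=0 rows with the B=1 rows

def negate_c(a, b, c, d, e, f, g, h):
    return (b, a, d, c, f, e, h, g)   # swap the C=0 rows with the C=1 rows

# Arbitrary table (sums to 1) that does NOT satisfy the constraint,
# so that a sign change would actually be visible.
table = (0.10, 0.05, 0.20, 0.15, 0.05, 0.10, 0.05, 0.30)

r = residual_gap(*table)
r_not_a = residual_gap(*negate_a(*table))   # sign flips
r_not_b = residual_gap(*negate_b(*table))   # unchanged
r_not_c = residual_gap(*negate_c(*table))   # sign flips
print(r, r_not_a, r_not_b, r_not_c)
```

Since the difference only ever changes sign, it is zero for the negated table exactly when it is zero for the original one.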

## Now what?

What does this constraint mean? It seems like some kind of residual, conditional on B, has to be equal to some kind of residual, conditional on !B. Each residual looks like a cross-product term. If we consider (a,b) and (e,f) as vectors, the !B residual is zero when those vectors are parallel (i.e. e = k.a and f = k.b for some constant k); likewise, the B residual is zero when (c,d) and (g,h) are parallel. When both residuals are zero, our expression holds. If the vectors are not parallel, the two residuals have to balance each other exactly, and things are trickier.

For the case where both residuals are zero we can introduce constants k and m, set e=ka, f=kb, g=mc, h=md, and the truth table is:

``````A B C p
0 0 0 a
0 0 1 b
0 1 0 c
0 1 1 d
1 0 0 ka
1 0 1 kb
1 1 0 mc
1 1 1 md
``````

Keeping in mind that we must have (ka+kb+mc+md) = 1-(a+b+c+d) we get that:

``````(k+1)(a+b) + (m+1)(c+d) = 1
=> m = [1 - (k+1)(a+b)]/(c+d) - 1
``````

So that, given a,b,c,d and k, m is already determined.
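As a sketch of this construction, the code below picks a, b, c, d and k, derives m from the sum-to-1 condition, and checks that the resulting table satisfies the original expression. The function names and the example values are made up for illustration; the check on m is an added assumption, needed so that every cell stays non-negative.

```python
def zero_residual_table(a, b, c, d, k):
    """Build (a, b, c, d, ka, kb, mc, md), with m fixed by the sum-to-1 condition."""
    m = (1 - (k + 1) * (a + b)) / (c + d) - 1
    if m < 0:
        raise ValueError("need (k+1)(a+b) + (c+d) <= 1 so that m is non-negative")
    return (a, b, c, d, k * a, k * b, m * c, m * d), m

def expression_gap(a, b, c, d, e, f, g, h):
    """rhs - lhs of p(A|C) = p(A|B)p(B|C) + p(A|!B)p(!B|C)."""
    p_b, p_nb, p_c = c + d + g + h, a + b + e + f, b + d + f + h
    lhs = (f + h) / p_c
    rhs = (g + h) / p_b * (d + h) / p_c + (e + f) / p_nb * (b + f) / p_c
    return rhs - lhs

table, m = zero_residual_table(a=0.2, b=0.1, c=0.15, d=0.05, k=0.5)
print(m)                       # 1.75, up to rounding
print(sum(table))              # 1, up to rounding
print(expression_gap(*table))  # 0, up to rounding
```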

## Transition Matrix Solution

Another way to get a solution with residual zero is as follows:

``````A B C p
0 0 0 (1-x)ac
0 0 1 x(1-b)c
0 1 0 (1-x)(1-a)(1-d)
0 1 1 xb(1-d)
1 0 0 (1-x)a(1-c)
1 0 1 x(1-b)(1-c)
1 1 0 (1-x)(1-a)d
1 1 1 xbd
``````

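This table sums to 1 by construction. Note that a, b, c, d and x here are new parameters (defined below), not the cell labels from the first truth table. In terms of those original cell labels, the table also has (a,b) || (e,f) and (c,d) || (g,h) by construction.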

These figures represent a process whereby C is picked at random (p(C) = x, p(!C) = 1-x), then B copies C, possibly incorrectly:

``````p(!B|!C) = a (true negative)
p(B|!C) = (1-a) (false positive)
p(!B|C) = (1-b) (false negative)
p(B|C) = b (true positive)
``````

then A copies B, possibly incorrectly:

``````p(!A|!B) = c (true negative)
p(A|!B) = (1-c) (false positive)
p(!A|B) = (1-d) (false negative)
p(A|B) = d (true positive)
``````

The three processes (picking C, adding errors to B, and adding errors to A) are mutually independent. The truth table above is the final result, and it can be verified that it obeys the original expression.
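That verification can be sketched in code. The function below builds the joint table directly from the copying process (pick C, then B copies C with errors, then A copies B with errors) and checks the original expression. The function names and the parameter values are arbitrary choices for illustration.

```python
def copy_chain_table(x, a, b, c, d):
    """Joint table for C -> B -> A, rows in truth-table order (A B C = 000 ... 111)."""
    p_c = {0: 1 - x, 1: x}
    p_b_given_c = {(0, 0): a, (1, 0): 1 - a, (0, 1): 1 - b, (1, 1): b}   # key: (B, C)
    p_a_given_b = {(0, 0): c, (1, 0): 1 - c, (0, 1): 1 - d, (1, 1): d}   # key: (A, B)
    return tuple(p_c[cc] * p_b_given_c[(bb, cc)] * p_a_given_b[(aa, bb)]
                 for aa in (0, 1) for bb in (0, 1) for cc in (0, 1))

def expression_gap(a, b, c, d, e, f, g, h):
    """rhs - lhs of p(A|C) = p(A|B)p(B|C) + p(A|!B)p(!B|C)."""
    p_b, p_nb, p_c = c + d + g + h, a + b + e + f, b + d + f + h
    lhs = (f + h) / p_c
    rhs = (g + h) / p_b * (d + h) / p_c + (e + f) / p_nb * (b + f) / p_c
    return rhs - lhs

table = copy_chain_table(x=0.3, a=0.9, b=0.8, c=0.7, d=0.6)
print(sum(table))              # 1, up to rounding
print(expression_gap(*table))  # 0, up to rounding
```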

## Family of Other Solutions

So we've pretty much covered all the solutions with zero residual. This family of solutions corresponds to A being related to B via a transition matrix, and B related to C via a transition matrix. Then the original expression (plus its permutations) says that A is related to C by the product of the two transition matrices.
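The transition-matrix reading can be sketched as follows, with each matrix stored so that its columns sum to 1. The parameter letters follow the process described earlier (a and b are B's true negative and true positive rates, c and d are A's); the numeric values and the helper name `matmul2` are assumptions for illustration.

```python
def matmul2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x, a, b, c, d = 0.3, 0.9, 0.8, 0.7, 0.6   # arbitrary example values

t_bc = [[a, 1 - b], [1 - a, b]]   # t_bc[i][j] = p(B=i | C=j)
t_ab = [[c, 1 - d], [1 - c, d]]   # t_ab[i][j] = p(A=i | B=j)
t_ac = matmul2(t_ab, t_bc)        # claimed to be p(A=i | C=j)

# Direct route: build p(A=i, B=k, C=j), sum out B, then divide by p(C=j).
p_c = [1 - x, x]
direct = [[sum(p_c[j] * t_bc[k][j] * t_ab[i][k] for k in range(2)) / p_c[j]
           for j in range(2)]
          for i in range(2)]

print(t_ac[1][1], direct[1][1])   # p(A|C) both ways
```

The entry `t_ac[1][1]` is p(A|B)p(B|C) + p(A|!B)p(!B|C), i.e. the right-hand side of the original expression; the other three entries are its negated variants.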

If we set the transition matrices to special cases we retrieve the easy geometric cases "A in B in C" (no false positives, some false negatives) and "C in B in A" (no false negatives, some false positives).

But ... our solution has 5 degrees of freedom (DOF). The full truth table has 7DOF (eight variables which must sum to exactly 1). The constraint Dave H worked out reduces this to 6DOF. But our solution has one DOF fewer because we force the residual to be zero.

There is therefore an entire 6DOF family of solutions with non-zero residual. But it's not clear if this family is a large set of coincidences, or if it has a meaningful interpretation.
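To show that such solutions really exist, here is one hand-picked table with equal but non-zero residuals. It is an illustration constructed for this check, not an example from the original discussion, and the function names are made up.

```python
def residuals(a, b, c, d, e, f, g, h):
    """Return ((gd-hc)/p(B), (fa-eb)/p(!B)), the two sides of the constraint."""
    return ((g * d - h * c) / (c + d + g + h),
            (f * a - e * b) / (a + b + e + f))

def expression_gap(a, b, c, d, e, f, g, h):
    """rhs - lhs of p(A|C) = p(A|B)p(B|C) + p(A|!B)p(!B|C)."""
    p_b, p_nb, p_c = c + d + g + h, a + b + e + f, b + d + f + h
    lhs = (f + h) / p_c
    rhs = (g + h) / p_b * (d + h) / p_c + (e + f) / p_nb * (b + f) / p_c
    return rhs - lhs

# Cells in truth-table order a..h; they sum to 1.
table = (0.25, 0.15, 0.10, 0.20, 0.00, 0.10, 0.15, 0.05)

res_b, res_not_b = residuals(*table)
gap = expression_gap(*table)
print(res_b, res_not_b)  # both about 0.05: equal, but not zero
print(gap)               # 0 up to rounding, so the expression holds
```

In this table p(A|B) = 0.4 and p(A|!B) = 0.2, yet (a,b) is not parallel to (e,f), so it is not a member of the zero-residual family.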

Over to you, the reader ...
