From Truth Tables to Joint Probability Distributions
In elementary symbolic logic we learn how to display the truth values of complex sentences as a function of the truth values of their simple components. Since each atomic sentence of the propositional calculus must take exactly one of the values T or F, and since the connectives of the calculus are truth-functional, the selection of any row of the table determines the truth value of a compound sentence built up out of just those atomic constituents. Thus, we might write:
| A | B | (A v ~B) |
|---|---|----------|
| T | T | T        |
| T | F | T        |
| F | T | F        |
| F | F | T        |
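The truth table above can also be generated mechanically. The following is a minimal Python sketch, not part of the original notes; the helper name `truth_table` is hypothetical, and the formula (A v ~B) is written as the Python expression `A or not B`.

```python
from itertools import product

def truth_table(variables, formula):
    """Return one (assignment, value) pair per row, listing T before F."""
    rows = []
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        rows.append((assignment, formula(**assignment)))
    return rows

# (A v ~B) expressed as a Python Boolean expression.
table = truth_table(["A", "B"], lambda A, B: A or not B)

for assignment, value in table:
    cells = ["T" if assignment[v] else "F" for v in ("A", "B")]
    cells.append("T" if value else "F")
    print(" | ".join(cells))
```

Each printed row corresponds to one row of the table, in the same order.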
When we turn to the probability calculus, we are still dealing with propositions, but the possible values are now more complex. We may stipulate that A is true, that it is false, or that it has some probability p, 0 ≤ p ≤ 1. Moreover, the probability calculus gives us a new sort of connection between propositions, namely conditional probability, which cannot be flattened out into ordinary material conditionals. But an extension of the truth table device is still valuable and clarifies the probabilistic relationships among propositions. Here is a simple example:
| A | B | P(A, B) |            |
|---|---|---------|------------|
| T | T | .38     | = P(A&B)   |
| T | F | .00     | = P(A&~B)  |
| F | T | .29     | = P(~A&B)  |
| F | F | .33     | = P(~A&~B) |
|   |   | 1.00    |            |
The first two columns give us the possible combinations of truth values for the propositional variables A and B, which we italicize to indicate that they are variables rather than assertions. The third column gives us a joint distribution of probabilities across the possible values of those variables. The final column indicates in a more familiar notation the probabilities involved.
Two features of this particular joint distribution are worth noting. First and most importantly, the row probabilities must sum to 1. This makes sense: the rows of a truth table, taken as a set, exhaust the logically possible options and are mutually exclusive; that is, they form a partition. So the sum of the row probabilities must be 1 in any coherent joint distribution. Second, the probability where A is true but B is false is 0. This suggests that A entails B, though this is not absolutely guaranteed by the numbers alone. (It could be that A&~B picks out a non-empty set of measure zero, which isn't quite the same thing.)
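Both features can be checked directly. The sketch below is an illustration rather than anything from the original notes: it stores the joint distribution as a Python dictionary keyed by (A, B) truth values, and the names `joint`, `is_coherent`, and `zero_rows` are hypothetical. A small tolerance is assumed because decimal fractions are not exact in floating point.

```python
import math

# Joint distribution from the table above, keyed by (A, B) truth values.
joint = {
    (True, True): 0.38,    # P(A&B)
    (True, False): 0.00,   # P(A&~B)
    (False, True): 0.29,   # P(~A&B)
    (False, False): 0.33,  # P(~A&~B)
}

def is_coherent(dist, tol=1e-9):
    """Check that no row is negative and that the rows sum to 1."""
    non_negative = all(p >= 0 for p in dist.values())
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

# Rows with probability zero; these hint at, but do not prove, an entailment.
zero_rows = [row for row, p in joint.items() if p == 0]

print(is_coherent(joint))
print(zero_rows)
```

Here the only zero row is the one where A is true and B is false, which is the row that would be ruled out if A entailed B.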
Sometimes when there are only two variables the same information is organized in a different way, like this:
|    | A   | ~A  |
|----|-----|-----|
| B  | .38 | .29 |
| ~B | .00 | .33 |
This is just a rearrangement; there is no deeper significance to this way of writing it. Since this format is not easily extended to three or more variables, we'll stick in what follows with the standard truth table format.
Given a joint probability distribution we can calculate the probabilities of individual events by adding the probabilities for the appropriate lines. Here, P(B) = .67 and P(A) = .38, and these two numbers do not have to sum to 1 (though by accident they might have done so). But some of the numbers are linked: P(~B) = 1 - P(B) = .33, and P(~A) = 1 - P(A) = .62.
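The row-adding procedure can be written out as a short Python sketch. This is an illustration under the assumption that the joint distribution is stored as a dictionary keyed by (A, B) truth values; the names `joint` and `probability` are hypothetical.

```python
# Joint distribution from the table above, keyed by (A, B) truth values.
joint = {
    (True, True): 0.38,
    (True, False): 0.00,
    (False, True): 0.29,
    (False, False): 0.33,
}

def probability(dist, event):
    """Add up the probabilities of the rows in which `event` holds."""
    return sum(p for (a, b), p in dist.items() if event(a, b))

p_A = probability(joint, lambda a, b: a)
p_B = probability(joint, lambda a, b: b)
p_not_A = probability(joint, lambda a, b: not a)
p_not_B = probability(joint, lambda a, b: not b)

print(round(p_A, 2), round(p_B, 2), round(p_not_A, 2), round(p_not_B, 2))
```

Because an event is just a condition on rows, the same function handles compound events as well, for example `lambda a, b: a or not b` for (A v ~B).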
Suppose that we were to learn that B is true: not to learn that it is probable, but to learn its truth in some unproblematic fashion. This piece of information effectively eliminates the two rows of the joint distribution where B is false, leaving us with the following shortened table:
| A | B | P(A, B) |           |
|---|---|---------|-----------|
| T | T | .38     | = P(A&B)  |
| F | T | .29     | = P(~A&B) |
|   |   | .67     |           |
Since the old probabilities in the remaining rows sum to .67 rather than to 1, we need to renormalize, dividing those probabilities by .67, in order to get a probability distribution. The new numbers are (approximately):
| A | B | P(A, B) |           |
|---|---|---------|-----------|
| T | T | .567    | = P(A&B)  |
| F | T | .433    | = P(~A&B) |
|   |   | 1.000   |           |
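The two steps just described, eliminating the rows where B is false and then renormalizing, can be sketched in Python as follows. The names `joint` and `condition` are hypothetical and the dictionary representation is an assumption made for illustration. In the usual notation, the renormalized values are the conditional probabilities P(A | B) and P(~A | B) under the original distribution.

```python
# Joint distribution before learning B, keyed by (A, B) truth values.
joint = {
    (True, True): 0.38,
    (True, False): 0.00,
    (False, True): 0.29,
    (False, False): 0.33,
}

def condition(dist, event):
    """Keep only the rows where `event` holds, then rescale them to sum to 1."""
    kept = {row: p for row, p in dist.items() if event(*row)}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return {row: p / total for row, p in kept.items()}

# Learn that B is true.
posterior = condition(joint, lambda a, b: b)

for row, p in posterior.items():
    print(row, round(p, 3))
```

The guard against a zero total reflects the fact that renormalizing requires dividing by the probability of what was learned, so the sketch refuses to condition on an event of probability zero.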
Notice that the two rows exhaust the ways for B to be true. So P(B) = P(A&B) + P(~A&B). This is what we would expect if (A&B) is incompatible with (~A&B), which it is. From logic we know that B is equivalent to the disjunction ((A&B) v (~A&B)), both syntactically (each formula can be derived from the other in standard propositional logic) and semantically (they share the same truth table). By equivalence (Hacking, p. 58) they must therefore have the same probability.
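The semantic half of this claim can be checked by brute force over the four rows of the truth table. The sketch below is illustrative only, with hypothetical names throughout; it confirms that B and ((A&B) v (~A&B)) agree on every row, that the two disjuncts never hold together, and that the original joint distribution therefore gives P(B) = P(A&B) + P(~A&B).

```python
import math
from itertools import product

rows = list(product([True, False], repeat=2))  # all (A, B) assignments

# B and ((A&B) v (~A&B)) take the same truth value on every row.
equivalent = all(b == ((a and b) or ((not a) and b)) for a, b in rows)

# (A&B) and (~A&B) are never true on the same row.
exclusive = not any((a and b) and ((not a) and b) for a, b in rows)

# Original joint distribution, keyed by (A, B) truth values.
joint = {
    (True, True): 0.38,
    (True, False): 0.00,
    (False, True): 0.29,
    (False, False): 0.33,
}

def probability(dist, event):
    """Add up the probabilities of the rows in which `event` holds."""
    return sum(p for (a, b), p in dist.items() if event(a, b))

p_B = probability(joint, lambda a, b: b)
p_sum = (probability(joint, lambda a, b: a and b)
         + probability(joint, lambda a, b: (not a) and b))

print(equivalent, exclusive, math.isclose(p_B, p_sum))
```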