Entailment and logical strength

Class handout, © Tim McGrew 2008

 

The concept of entailment gives us a way of describing the relation that holds between two propositions when one is logically stronger than the other. Using the consequence symbol (╞) to show that we intend to consider entailment a semantic notion here, we can write some expressions with a logically stronger statement on the left and a weaker one on the right:

 

(A & B)╞ A

 

P ╞ (P v Q)

 

P ╞ (Q → P)

 

In each case, the rules of any acceptable (consistent and complete) system of propositional logic will permit us to derive the formula on the right from the one on the left.

 

One way of visualizing the fact that (A & B) is riskier than A is to check out the rows of a common truth table and see how many ways there are for each sentence to be true.

 

  A          B

  T          T

  T          F

  F          T

  F          F

 

The sentence A is true on both of the first two rows of the table; the only ways for it to be false are the third and fourth rows. The sentence (A & B), however, is made true only by the assignment of truth values on the first row; it is false on all three of the others. The table displays for us the possible ways for a sentence to be true or false, and there are simply more ways for A to be true, and correspondingly fewer ways for it to be false, than there are for (A & B). And this is not just a matter of counting rows: the set of rows that makes (A & B) true is a proper subset of the set of rows that makes A true. This is another way of displaying the entailment relation.
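The subset observation can be checked mechanically. The following is a minimal sketch, not part of the original handout; the variable names (rows, rows_A, rows_A_and_B) are illustrative choices. It lists the four rows of the table, collects the rows on which each sentence is true, and tests whether one set is a proper subset of the other.

```python
from itertools import product

# Every assignment of truth values to (A, B), in the same order as the table.
rows = list(product([True, False], repeat=2))

# The set of rows on which each sentence is true.
rows_A = {(a, b) for a, b in rows if a}
rows_A_and_B = {(a, b) for a, b in rows if a and b}

# For sets, "<" is Python's proper-subset test.
is_proper_subset = rows_A_and_B < rows_A

print(len(rows_A_and_B), len(rows_A), is_proper_subset)
```

Here (A & B) is true on one row and A on two, and the proper-subset test succeeds.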

 

The essential point to notice for our purposes here is that when the relation of entailment holds between two formulas φ and ψ, so that φ ╞ ψ, the value of φ can never be higher than the value of ψ. For suppose that φ is true, that is, has the value 1. Then, because of the entailment relation, ψ must also be true – that is, it must also have the value 1. On the other hand, suppose that φ is false. Then it has the value 0, and no matter what the value of ψ turns out to be, it cannot be lower than zero.

 

When we generalize our valuation scheme from the 1’s and 0’s of deductive logic to real numbers drawn from the interval from 0 to 1, we find that the resulting probabilities still obey this inequality: if φ ╞ ψ, then P(φ) ≤ P(ψ), where P(φ) is the probability of φ, a real number in the interval [0,1]. For example, if (A & B) has the probability .75, then A, which is entailed by (A & B), must have a probability at least as high as .75, perhaps higher. We write this compactly:

If P(A & B) = .75, then P(A) ≥ .75.

This makes good sense epistemically: since there are more things that can go wrong with the stronger statement than with the weaker statement, we want the ordering of the claims to be reflected in the probabilities we assign them.
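One way to picture the move from truth values to probabilities is to spread a total weight of 1 over the rows of the truth table and take the probability of a sentence to be the sum of the weights of the rows on which it is true. The sketch below assumes this row-weight picture, which the handout does not spell out, and the particular weights are hypothetical numbers chosen only so that P(A & B) = .75.

```python
# Hypothetical weights on the four rows (A, B); any non-negative numbers
# that sum to 1 would serve equally well.
weights = {
    (True, True): 0.75,
    (True, False): 0.10,
    (False, True): 0.05,
    (False, False): 0.10,
}

def prob(sentence):
    """Sum the weights of the rows on which the sentence is true."""
    return sum(w for (a, b), w in weights.items() if sentence(a, b))

p_conj = prob(lambda a, b: a and b)  # probability of (A & B)
p_a = prob(lambda a, b: a)           # probability of A

# Every row that makes (A & B) true also makes A true, so the sum for A
# contains every term of the sum for (A & B), plus possibly more.
print(p_conj <= p_a)
```

Because the rows counted for (A & B) are among the rows counted for A, no choice of weights can make P(A & B) exceed P(A).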

 

In a moment, we will want to generalize the result we’ve just obtained so that we can say something interesting about the relation between the probabilities of a set of premises (rather than just the single premise φ) and the probability of a conclusion. But first, we need to note an important point about the entailment relation: the logically strongest claim that we can derive from a set of premises φ1, …, φn is their conjunction (φ1 & ... & φn). Since the theorem on probabilities and logical strength applies here, it follows that the probability of any consequence of the set of premises φ1, …, φn will be at least as great as the probability of (φ1 & ... & φn).

 

Some deductively valid arguments have redundant premises; others (the majority of those we work with in introductory logic courses) need every premise if the entailment is to hold. For the time being, let’s focus on the latter sort and call them simple deductive arguments – understanding that the word “simple” here does not mean that the derivations are necessarily easy, just that they require the use of every premise to go through.

 

One more bit of notation will be useful before we launch out. We need a convenient notation to represent the gap between the probability of a formula φ and the probability of a certainty, the gap between P(φ) and 1. The natural way to do this is to coin an expression – say, the “uncertainty” of φ, written U(φ), and define it as being equal to 1 – P(φ).

 

Now we are ready to write an important theorem, called the Uncertainty Theorem:

 

If φ1, …, φn ╞ ψ is a simple deductively valid argument, then U(ψ) ≤ U(φ1) + … + U(φn)
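One standard way to see why the bound holds is sketched below. It is offered as an aid, not as the handout's own proof. It combines the logical-strength inequality with two assumptions about P that the text above does not state explicitly: that P(~φ) = 1 – P(φ), and that the probability of a disjunction never exceeds the sum of the probabilities of its disjuncts.

```latex
\begin{align*}
P(\psi) &\ge P(\varphi_1 \land \dots \land \varphi_n) \\
        &= 1 - P(\lnot\varphi_1 \lor \dots \lor \lnot\varphi_n) \\
        &\ge 1 - \sum_{i=1}^{n} P(\lnot\varphi_i) \\
        &= 1 - \sum_{i=1}^{n} U(\varphi_i)
\end{align*}
```

Subtracting both sides from 1 gives U(ψ) = 1 – P(ψ) ≤ U(φ1) + … + U(φn).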

 

It is easy to apply this to a short deductive argument when we are somewhat uncertain about the premises. For clarity, we will stipulate some probabilities and see how the theorem applies:

 

Sentence          Probability        Uncertainty

  P                         .80                    .20

  (P → Q)             .95                    .05    

  Q                     ≥ .75                 ≤ .25
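The arithmetic behind the last row of the table can be sketched in a few lines. The function name conclusion_bounds is a hypothetical helper introduced here for illustration; it adds up the uncertainties of the premises, caps the total at 1 (since no uncertainty can exceed 1), and returns the resulting bounds on the conclusion.

```python
def conclusion_bounds(premise_probs):
    """Return (lower bound on P(conclusion), upper bound on U(conclusion)).

    Applies the Uncertainty Theorem: the uncertainty of the conclusion is
    at most the sum of the uncertainties of the premises, capped at 1.
    """
    total_uncertainty = sum(1 - p for p in premise_probs)
    max_uncertainty = min(1.0, total_uncertainty)
    return 1 - max_uncertainty, max_uncertainty

# Premises P (.80) and (P → Q) (.95), as stipulated in the table above.
min_prob_q, max_uncertainty_q = conclusion_bounds([0.80, 0.95])

# Rounded for display, since floating-point sums are only approximate.
print(round(min_prob_q, 2), round(max_uncertainty_q, 2))
```

Up to floating-point rounding, the bounds agree with the table: the probability of Q is at least .75 and its uncertainty at most .25.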

 

But the theorem also has some surprising and pleasing applications to more difficult or even problematic inferences. Consider the well-known "proof" of an arbitrary formula from the inconsistent premises P and ~P. For the sake of generality, we will not specify a particular number as the probability of P but will use the variable r instead. Then we have:

 

Sentence          Probability        Uncertainty

  P                            r                     1 – r

  ~P                    1 – r                           r  

  Q                       ≥ 0                       ≤ 1

 

Here is an intuitively satisfying result: the derivation of some arbitrary conclusion Q from inconsistent premises, though logically impeccable, gives us in and of itself no epistemic satisfaction since the most we are entitled to say regarding Q on the basis of that derivation is that its probability is either zero or greater – in other words, we learn nothing about Q.

 

Next, consider the derivation of a tautology from an empty set of premises:

 

Conclusion       Probability        Uncertainty

                                                            

  (P v ~P)             ≥ 1                       ≤ 0

 

There are no premises – hence, the sum of the uncertainties of the premises is zero. So if the formula in question is really a consequence of the empty set (which this one is), then its probability must be 1.

 

Finally, consider the so-called Lottery Paradox first investigated by Henry Kyburg. In a fair, closed, 1000-ticket lottery, every ticket is sold and just one winner will be selected by a random drawing method. The probability that any given ticket is a loser is .999, and the assertion LTn (“Ticket #n is a loser”) therefore has this probability for anyone who knows nothing relevant to its winning but the conditions of the lottery. Now consider the following deductively valid argument:

 

Sentence                      Probability        Uncertainty

 

  LT1                                 .999                 .001

  LT2                                .999                 .001

    .                                       .                       .

    .                                       .                       .

    .                                       .                       .

  LT1000                             .999                 .001

  (LT1&…&LT1000)           ≥ 0                   ≤ 1

 

The sum of the uncertainties of all 1000 premises is 1, so on the basis of this argument we are entitled to say nothing significant about the conclusion. Once again, this is intuitively right; for given the conditions of the lottery, we know that the conclusion must be false. Even so, the Lottery Paradox is a striking illustration of the fact that a valid deductive argument with premises that are nearly certain may have a conclusion that is certainly false.
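The lottery figures can be reproduced with exact arithmetic. The sketch below models the lottery as 1000 equally likely outcomes, one for each possible winning ticket. That modelling choice and the helper name prob are assumptions made here for illustration. It then computes the probability of each premise LTn, the sum of the premises' uncertainties, and the probability of the conjunction.

```python
from fractions import Fraction

N = 1000  # number of tickets, as in the lottery described above

# One equally likely outcome per possible winning ticket.
outcomes = range(1, N + 1)

def prob(event):
    """Exact probability of an event, given as a predicate on the winner."""
    favourable = sum(1 for winner in outcomes if event(winner))
    return Fraction(favourable, N)

# P(LTn) for each n: ticket n loses whenever some other ticket wins.
premise_probs = [prob(lambda winner, n=n: winner != n) for n in outcomes]

# Sum of the uncertainties U(LTn) = 1 - P(LTn) over all premises.
total_uncertainty = sum(1 - p for p in premise_probs)

# P(LT1 & ... & LT1000): no outcome makes every ticket a loser.
p_conclusion = prob(lambda winner: all(winner != n for n in outcomes))

print(premise_probs[0], total_uncertainty, p_conclusion)
```

Under these assumptions each premise has probability 999/1000, the uncertainties sum to exactly 1, and the conjunction has probability 0, which is consistent with the bound of "≥ 0" that the theorem provides.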