incontrovertible evidence that the intermediate conclusion follows from things

already established. Here, the logician's standards of rigor are extreme. It is

not enough to show that each step in a purported proof almost certainly

follows from the ones that come before. That may be good enough for getting

around in our daily life, but it is not good enough if our aim is to demonstrate

that S must be true provided P1 , . . . , Pn are all true.

There is a practical reason for this demand for rigor. In ordinary life,

we frequently reason in a step-by-step fashion, without requiring absolute

certainty at each step. For most purposes, this is fine, since our everyday

“proofs” generally traverse only a small number of intermediate conclusions.

But in many types of reasoning, this is not the case.

Think of what you did in high school geometry. First you started with a

small number of axioms that stated the basic premises of Euclidean geometry.

You then began to prove conclusions, called theorems, from these axioms. As

Section 2.2

48 / The Logic of Atomic Sentences

you went on to prove more interesting theorems, your proofs would cite earlier

theorems. These earlier theorems were treated as intermediate conclusions in

justifying the new results. What this means is that the complete proofs of

the later theorems really include the proofs of the earlier theorems that they

presuppose. Thus, if they were written out in full, they would contain hundreds

or perhaps thousands of steps. Now suppose we only insisted that each step

show with probability .99 that the conclusion follows from the premises. Then

each step in such a proof would be a pretty good bet, but given a long enough

proof, the proof would carry virtually no weight at all regarding the truth of the

conclusion.

This demand for certainty becomes even more important in proofs done by

computers. Nowadays, theorems are sometimes proven by computers, and the

proofs can be millions of steps long. If we allowed even the slightest uncertainty

in the individual steps, then this uncertainty would multiply until the alleged

“proof” made the truth of the conclusion no more likely than its falsity.
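The arithmetic behind this compounding is worth seeing explicitly. Here is a minimal sketch; the .99 figure and the step counts simply follow the discussion above, and treating the steps as independent is our simplifying assumption:

```python
# If each step of a proof is only 99% reliable, and steps err
# independently, the chance that every step holds decays
# exponentially with the length of the proof.
def chance_all_steps_hold(per_step, steps):
    return per_step ** steps

short_proof = chance_all_steps_hold(0.99, 10)           # everyday reasoning
long_proof = chance_all_steps_hold(0.99, 1_000)         # geometry-style proof
machine_proof = chance_all_steps_hold(0.99, 1_000_000)  # computer proof

print(short_proof)    # still a pretty good bet
print(long_proof)     # about 0.00004: virtually no weight
print(machine_proof)  # so small it underflows to zero
```

A ten-step argument survives with about a 90% chance, but a thousand-step proof is already almost worthless, which is exactly the point made above.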

Each time we introduce new types of expressions into our language, we will

discuss new methods of proof supported by those expressions. We begin by

discussing the main informal methods of proof used in mathematics, science,

and everyday life, emphasizing the more important methods like indirect and

conditional proof. Following this discussion we will “formalize” the methods

by incorporating them into what we call a formal system of deduction. A

formal system of deduction uses a fixed set of rules specifying what counts as

an acceptable step in a proof.

The difference between an informal proof and a formal proof is not one of

rigor, but of style. An informal proof of the sort used by mathematicians is

every bit as rigorous as a formal proof. But it is stated in English and is usually more free-wheeling, leaving out the more obvious steps. For example, we

could present our earlier argument about Socrates in the form of the following

informal proof:

Proof: Since Socrates is a man and all men are mortal, it follows

that Socrates is mortal. But all mortals will eventually die, since

that is what it means to be mortal. So Socrates will eventually die.

But we are given that everyone who will eventually die sometimes

worries about it. Hence Socrates sometimes worries about dying.

A formal proof, by contrast, employs a fixed stock of rules and a highly stylized method of presentation. For example, the simple argument from Cube(c)

and c = b to Cube(b) discussed in the last section will, in our formal system,

take the following form:

1. Cube(c)

2. c = b

3. Cube(b)        = Elim: 1, 2

As you can see, we use an extension of the Fitch format as a way of presenting

formal proofs. The main difference is that a formal proof will usually have more

than one step following the Fitch bar (though not in this example), and each

of these steps will be justified by citing a rule of the formal system. We will

explain later the various conventions used in formal proofs.

In the course of this book you will learn how to give both informal and

formal proofs. We do not want to give the impression that formal proofs are

somehow better than informal proofs. On the contrary, for purposes of proving

things for ourselves, or communicating proofs to others, informal methods are

usually preferable. Formal proofs come into their own in two ways. One is that

they display the logical structure of a proof in a form that can be mechanically

checked. There are advantages to this, if you are a logic teacher grading lots

of homework, a computer, or not inclined to think for some other reason. The

other is that they allow us to prove things about provability itself, such as

Gödel's Completeness Theorem and Incompleteness Theorems, discussed in

the final section of this book.
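To make the idea of mechanical checking concrete, here is a toy checker for a single rule, = Elim, applied to the Cube(c) argument from this section. Everything here (the Step class, the string encoding of sentences, the function name) is an invention of this sketch, not part of any real proof system:

```python
from dataclasses import dataclass

@dataclass
class Step:
    sentence: str  # e.g. "Cube(b)"
    rule: str      # e.g. "= Elim"
    cited: tuple   # indices of the earlier steps cited

def check_identity_elim(proof, i):
    """Check that step i follows by = Elim: the conclusion must result
    from a cited sentence by substituting, per a cited identity a = b,
    the name b for the name a."""
    step = proof[i]
    if step.rule != "= Elim" or len(step.cited) != 2:
        return False
    base = proof[step.cited[0]].sentence
    identity = proof[step.cited[1]].sentence
    left, _, right = identity.partition(" = ")
    return step.sentence == base.replace(left, right)

proof = [
    Step("Cube(c)", "Premise", ()),
    Step("c = b", "Premise", ()),
    Step("Cube(b)", "= Elim", (0, 1)),
]
print(check_identity_elim(proof, 2))  # a valid application of = Elim
```

The point is not the code itself but that the check is purely syntactic: no insight into cubes or identity is needed, which is what makes formal proofs checkable by a machine.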

Remember

1. A proof of a statement S from premises P1 , . . . , Pn is a step-by-step

demonstration which shows that S must be true in any circumstances

in which the premises P1 , . . . , Pn are all true.

2. Informal and formal proofs differ in style, not in rigor.

Proofs involving the identity symbol

We have already seen one example of an important method of proof. If we can

prove, from whatever our premises happen to be, that b = c, then we know

that anything that is true of b is also true of c. After all, b is c. In philosophy,

this simple observation sometimes goes by the fancy name the indiscernibility

of identicals and sometimes by the less pretentious name substitution.

Shakespeare no doubt had this principle in mind when he wrote “A rose, by any

other name, would smell as sweet.”

We will call the formal rule corresponding to this principle Identity

Elimination, abbreviated = Elim. The reason for this name is that an application

of this rule “eliminates” a use of the identity symbol when we move from the

premises of the argument to its conclusion. We will have another rule that

introduces the identity symbol.

The principle of identity elimination is used repeatedly in mathematics.

For example, the following derivation uses the principle in conjunction with

the well-known algebraic identity x² − 1 = (x − 1)(x + 1):

x² > x² − 1

so

x² > (x − 1)(x + 1)

We are all familiar with reasoning that uses such substitutions repeatedly.
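Since the substitution is licensed by an identity, both the identity and the resulting inequality can be spot-checked numerically. This is a quick sanity check, not a proof, and the sample values are arbitrary:

```python
# Spot-check the identity x^2 - 1 = (x - 1)(x + 1), and the inequality
# obtained from x^2 > x^2 - 1 by substituting equals for equals.
for x in [-3, 0, 2, 7]:
    assert x ** 2 - 1 == (x - 1) * (x + 1)  # the algebraic identity
    assert x ** 2 > (x - 1) * (x + 1)       # the substituted conclusion
print("identity and inequality hold on all sample values")
```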

Another principle, so simple that one often overlooks it, is the so-called

reflexivity of identity. The formal rule corresponding to it is called Identity

Introduction, or = Intro, since it allows us to introduce identity statements

into proofs. It tells us that any sentence of the form a = a can be validly

inferred from whatever premises are at hand, or from no premises at all. This

is because of the assumption made in fol that names always refer to one and

only one object. This is not true about English, as we have noted before. But

it is in fol, which means that in a proof you can always take any name a

that is in use and assert a = a, if it suits your purpose for some reason. (As a

matter of fact, it is rarely of much use.) Gertrude Stein was surely referring

to this principle when she observed “A rose is a rose is a rose.”

Another principle, a bit more useful, is that of the symmetry of identity. It

allows us to conclude b = a from a = b. Actually, if we wanted, we could derive

this as a consequence of our first two principles, by means of the following

proof.

Proof: Suppose that a = b. We know that a = a, by the reflexivity

of identity. Now substitute the name b for the first use of the name

a in a = a, using the indiscernibility of identicals. We come up with

b = a, as desired.

The previous paragraph is another example of an informal proof. In an

informal proof, we often begin by stating the premises or assumptions of the

proof, and then explain in a step-by-step fashion how we can get from these

assumptions to the desired conclusion. There are no strict rules about how

detailed the explanation needs to be. This depends on the sophistication of

the intended audience for the proof. But each step must be phrased in clear

and unambiguous English, and the validity of the step must be apparent. In

the next section, we will see how to formalize the above proof.
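The symmetry proof is, at bottom, a single textual substitution, which we can mimic directly. Treating sentences as strings is our simplification for illustration, not how fol actually works:

```python
# Start from the reflexivity instance "a = a", then, licensed by the
# premise a = b, replace the first occurrence of "a" with "b".
reflexive = "a = a"
derived = reflexive.replace("a", "b", 1)  # substitute only the first "a"
assert derived == "b = a"                 # symmetry, as desired
```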

A third principle about identity that bears noting is its so-called transitivity.

If a = b and b = c are both true, then so is a = c. This is so obvious that there

is no particular need to prove it, but it can be proved using the indiscernibility

of identicals. (See Exercise 2.5.)

If you are using a language that contains function symbols (introduced

in the optional Section 1.5), the identity principles we've discussed also hold

for complex terms built up using function symbols. For example, if you know

that Happy(john) and john = father(max), you can use identity elimination to

conclude Happy(father(max)), even though father(max) is a complex term, not

a name. In fact, the example where we substituted (x − 1)(x + 1) for x² − 1

also applied the indiscernibility of identicals to complex terms.

Remember

There are four important principles that hold of the identity relation:

1. = Elim: If b = c, then whatever holds of b holds of c. This is also

known as the indiscernibility of identicals.

2. = Intro: Sentences of the form b = b are always true (in fol). This

is also known as the reflexivity of identity.

3. Symmetry of Identity: If b = c, then c = b.

4. Transitivity of Identity: If a = b and b = c, then a = c.

The latter two principles follow from the first two.

Proofs involving other predicates and relations

Sometimes there will be logical dependencies among the predicate symbols in

a first-order language, dependencies similar to those just discussed involving

the identity symbol. This is the case, for example, in the blocks language.

When this is so, proofs may need to exploit such relationships. For example,

the sentence Larger(a, c) is a consequence of Larger(a, b) and Larger(b, c). This

is because the larger-than relation, like the identity relation, is transitive. It is

because of this that any world where the latter two sentences are true will also

be one in which the first is true. Similar examples are given in the problems.

Another example of this sort that is used frequently in mathematics in-

volves the transitivity of the less-than relation. You frequently encounter

proofs written in the following form:

k1 < k2

k2 < k3

k3 < k4

so

k1 < k4

This proof contains two implicit uses of the transitivity of <.
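The chain can be checked link by link; transitivity then delivers the conclusion without comparing k1 and k4 directly. A sketch, with arbitrary sample values of our own choosing:

```python
# From k1 < k2, k2 < k3, and k3 < k4, two uses of transitivity
# give k1 < k4. Checking only the adjacent links suffices.
def adjacent_links_hold(ks):
    return all(a < b for a, b in zip(ks, ks[1:]))

k1, k2, k3, k4 = 1, 4, 9, 16  # sample values, not from the text
assert adjacent_links_hold([k1, k2, k3, k4])
assert k1 < k4  # what transitivity guarantees
```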

There is no way to catalog all the legitimate inferences involving predicate

and relation symbols in all the languages we might have occasion to deal with.

But the example of identity gives us a few things to look for. Many relations

besides identity are transitive: larger than and less than are just two examples.

And many are reflexive and/or symmetric: being the same size as and being in

the same row as are both. But you will run across other logical dependencies

that don't fall under these headings. For instance, you might be told that b

is larger than c and want to infer that c is smaller than b. This holds because

larger than and smaller than are what is known as “inverses”: they refer to

the same relation but point, so to speak, in opposite directions. Usually, you

will have no trouble spotting the logical dependencies among predicates, but

in giving a proof, you need to make explicit the ones you are assuming.

Let's look at one final example before trying our hand at some exercises.

Suppose we were asked to give an informal proof of the following argument:

RightOf(b, c)

LeftOf(d, e)

b=d

LeftOf(c, e)

Our informal proof might run like this:

Proof: We are told that b is to the right of c. So c must be to the

left of b, since right of and left of are inverses of one another. And

since b = d, c is left of d, by the indiscernibility of identicals. But we

are also told that d is left of e, and consequently c is to the left of e,

by the transitivity of left of. This is our desired conclusion.
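The steps of this informal proof can be mirrored in a small model, encoding the blocks language predicates by x-coordinates. The coordinate encoding and all the helper names here are our illustrative assumptions, not part of the blocks language itself:

```python
# Positions on a line: a smaller x-coordinate means further left.
pos = {"c": 1, "b": 2, "e": 3}
pos["d"] = pos["b"]  # premise b = d: the two names co-refer

def left_of(x, y):
    return pos[x] < pos[y]

def right_of(x, y):
    return pos[y] < pos[x]  # right of is the inverse of left of

assert right_of("b", "c")  # premise: RightOf(b, c)
assert left_of("d", "e")   # premise: LeftOf(d, e)
assert left_of("c", "b")   # since right of and left of are inverses
assert left_of("c", "d")   # by the indiscernibility of identicals (b = d)
assert left_of("c", "e")   # by the transitivity of left of: LeftOf(c, e)
```

Each assertion corresponds to one line of the informal proof, in the same order.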
