$q_k$ and $p_j$ ($j, k = 1, 2, 3$) satisfying the commutation relations

$$p_j q_k - q_k p_j = \frac{\hbar}{i}\,\delta_{jk}$$

and diagonalizes the matrix $H = p^2/2m + V(q)$. Schrödinger remarks [30, p. 46]:

“My theory was inspired by L. de Broglie, Ann. de Physique (10) 3,
p. 22, 1925 (Theses, Paris, 1924), and by brief, yet infinitely far-seeing
remarks of A. Einstein, Berl. Ber., 1925, p. 9 et seq. I did not at all
suspect any relation to Heisenberg's theory at the beginning. I naturally
knew about his theory, but was discouraged, if not repelled, by what
appeared to me as very difficult methods of transcendental algebra, and
by the want of perspicuity (Anschaulichkeit).”

The remarkable thing was that where the two theories disagreed with
the old quantum theory of Bohr, they agreed with each other (and with
experiment!). Schrödinger quickly discovered the mathematical equivalence
of the two theories, based on letting $q_k$ correspond to the operator
of multiplication by the coordinate function $x_k$ and letting $p_j$ correspond
to the operator $(\hbar/i)\,\partial/\partial x_j$ (see the fourth paper in [30]).
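As a quick check of this correspondence (a sketch, assuming $f$ is a smooth test function of the coordinates; the symbol $f$ is introduced here and does not appear in the text), the multiplication and differentiation operators satisfy the commutation relations by the product rule:

```latex
\begin{aligned}
(p_j q_k - q_k p_j) f
  &= \frac{\hbar}{i}\left(\frac{\partial}{\partial x_j}(x_k f)
     - x_k \frac{\partial f}{\partial x_j}\right) \\
  &= \frac{\hbar}{i}\,\frac{\partial x_k}{\partial x_j}\, f
   = \frac{\hbar}{i}\,\delta_{jk}\, f .
\end{aligned}
```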

Schrödinger maintained (and most physicists agree) that the mathematical
equivalence of two physical theories is not the same as their
physical equivalence, and went on to describe a possible physical
interpretation of the wave function $\psi$. According to this interpretation an
electron with wave function $\psi$ is not a localized particle but a smeared
out distribution of electricity with charge density $e\rho$ and electric current
$ej$, where

$$\rho = |\psi|^2, \qquad j = \frac{i\hbar}{2m}\left(\psi \operatorname{grad} \bar\psi - \bar\psi \operatorname{grad} \psi\right).$$
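For illustration only, consider a one-dimensional plane wave $\psi(x) = e^{ikx}$. The wave number $k$ and this choice of $\psi$ are assumptions made for the example, and such a $\psi$ is not normalizable. The formulas then give a uniform density and a current equal to the density times the velocity $\hbar k/m$:

```latex
\begin{aligned}
\rho &= |e^{ikx}|^2 = 1, \\
j &= \frac{i\hbar}{2m}\left(\psi\,\frac{d\bar\psi}{dx} - \bar\psi\,\frac{d\psi}{dx}\right)
   = \frac{i\hbar}{2m}\,(-ik - ik) = \frac{\hbar k}{m}.
\end{aligned}
```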

(The quantities ρ and j determine ψ except for a multiplicative factor

of absolute value one.) This interpretation works very well for a single

electron bound in an atom, provided one neglects the self-repulsion of the

smeared out electron. However, when there are $n$ electrons, $\psi$ is a function
on configuration space $\mathbb{R}^{3n}$, rather than coordinate space $\mathbb{R}^3$, which makes
the interpretation of $\psi$ as a physically real object very difficult. Also, for
free electrons $\psi$, and consequently $\rho$, spreads out more and more as time
goes on; yet the arrival of electrons at a scintillation screen is always
signaled by a sharply localized flash, never by a weak, spread out flash.
These objections were made to Schrödinger's theory when he lectured on

REMARKS ON QUANTUM MECHANICS 93

it in Copenhagen, and he reputedly said he wished he had never invented

the theory.

The accepted interpretation of the wave function ψ was put forward

by Born [31], and quantum mechanics was given its present form by Dirac

[32] and von Neumann [33]. Let us briefly describe quantum mechanics,

neglecting superselection rules.

To each physical system there corresponds a Hilbert space H . To

every state (also called pure state) of the system there corresponds an

equivalence class of unit vectors in H , where ψ1 and ψ2 are called equiv-

alent if ψ1 = aψ2 for some complex number a of absolute value one.

(Such an equivalence class, which is a circle, is frequently called a ray.)

The correspondence between states and rays is one-to-one. To each ob-

servable of the system there corresponds a self-adjoint operator, and the

correspondence is again one-to-one. The development of the system in

time is described by a family of unitary operators U (t) on H . There are

two ways of thinking about this. In the Schrödinger picture, the state of
the system changes with time: $\psi(t) = U(t)\psi_0$, where $\psi_0$ is the state at
time 0, and observables do not change with time. In the Heisenberg picture,
observables change with time: $A(t) = U(t)^{-1} A_0 U(t)$, and the state
does not change with time. The two pictures are equivalent, and it is a
matter of convention which is used. For an isolated physical system, the
dynamics is given by $U(t) = \exp(-(i/\hbar)Ht)$, where $H$, the Hamiltonian,
is the self-adjoint operator representing the energy of the system.
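The equivalence of the two pictures can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: a two-dimensional Hilbert space, a diagonal Hamiltonian with hypothetical energies 0.5 and 2.0, units with $\hbar = 1$, and the Pauli x matrix as the observable. None of these choices come from the text.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1


def inner(phi, psi):
    """Inner product (phi, psi): antilinear in phi, linear in psi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(M, psi):
    """Apply the 2x2 matrix M to the vector psi."""
    return [sum(M[r][c] * psi[c] for c in range(2)) for r in range(2)]


def matmul(M, N):
    """Product of two 2x2 matrices."""
    return [[sum(M[r][k] * N[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def dagger(M):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[M[c][r].conjugate() for c in range(2)] for r in range(2)]


def U(t, energies=(0.5, 2.0)):
    """U(t) = exp(-(i/hbar) H t) for the diagonal H = diag(energies)."""
    e1, e2 = energies
    return [[cmath.exp(-1j * e1 * t / HBAR), 0j],
            [0j, cmath.exp(-1j * e2 * t / HBAR)]]


A0 = [[0j, 1 + 0j], [1 + 0j, 0j]]          # observable at time 0 (Pauli x)
psi0 = [complex(0.6), complex(0.0, 0.8)]   # unit vector: 0.36 + 0.64 = 1


def expectation_schroedinger(t):
    """State evolves, observable fixed: (psi(t), A0 psi(t))."""
    psi_t = apply(U(t), psi0)
    return inner(psi_t, apply(A0, psi_t)).real


def expectation_heisenberg(t):
    """Observable evolves, state fixed: (psi0, A(t) psi0)."""
    Ut = U(t)
    # U(t)^{-1} equals the adjoint of U(t) because U(t) is unitary.
    A_t = matmul(dagger(Ut), matmul(A0, Ut))
    return inner(psi0, apply(A_t, psi0)).real
```

For any time `t`, `expectation_schroedinger(t)` and `expectation_heisenberg(t)` should agree up to rounding error, which is the sense in which the choice of picture is a matter of convention.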

It may happen that one does not know the state of the physical sys-

tem, but merely that it is in state ψ1 with probability w1 , state ψ2 with

probability w2 , etc., where w1 + w2 + . . . = 1. This is called a mixture

(impure state), and we shall not describe its mathematical representation

further.

The important new notion is that of a superposition of states. Suppose

that we have two states $\psi_1$ and $\psi_2$. The number $|(\psi_1, \psi_2)|^2$ does not
depend on the choice of representatives of the rays and lies between 0
and 1. Therefore, it can be regarded as a probability. If we know that the
system is in the state $\psi_1$ and we perform an experiment to see whether
or not the system is in the state $\psi_2$, then $|(\psi_1, \psi_2)|^2$ is the probability of
finding that the system is indeed in the state $\psi_2$. We can write

$$\psi_1 = (\psi_2, \psi_1)\psi_2 + (\psi_3, \psi_1)\psi_3$$

where ψ3 is orthogonal to ψ2 . We say that ψ1 is a superposition of the

states ψ2 and ψ3 . Consider the mixture that is in the state ψ2 with

94 CHAPTER 14

probability $|(\psi_2, \psi_1)|^2$ and in the state $\psi_3$ with probability $|(\psi_3, \psi_1)|^2$.
Then $\psi_1$ and the mixture have equal probabilities of being found in the
states $\psi_2$ and $\psi_3$, but they are quite different. For example, $\psi_1$ has the
probability $|(\psi_1, \psi_1)|^2 = 1$ of being found in the state $\psi_1$, whereas the
mixture has only the probability $|(\psi_2, \psi_1)|^4 + |(\psi_3, \psi_1)|^4$ of being found in
the state $\psi_1$.
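The difference between the superposition and the mixture can be seen in a small numerical sketch. The two-dimensional space and the particular coefficients (squared moduli 0.3 and 0.7) are assumptions chosen for illustration, not values from the text.

```python
import math


def inner(phi, psi):
    """Inner product (phi, psi): antilinear in phi, linear in psi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def prob(phi, psi):
    """|(phi, psi)|^2: probability that a system in state psi is found in phi."""
    return abs(inner(phi, psi)) ** 2


# Orthonormal states psi2, psi3 and a superposition psi1 of them.
psi2 = [1 + 0j, 0j]
psi3 = [0j, 1 + 0j]
psi1 = [complex(math.sqrt(0.3)), complex(0.0, math.sqrt(0.7))]

w2 = prob(psi2, psi1)  # probability of finding psi1 in psi2
w3 = prob(psi3, psi1)  # probability of finding psi1 in psi3

# Pure state psi1: certain to be found in psi1.
p_pure = prob(psi1, psi1)

# Mixture: psi2 with probability w2, psi3 with probability w3.
# Its probability of being found in psi1 is w2^2 + w3^2.
p_mixture = w2 * prob(psi1, psi2) + w3 * prob(psi1, psi3)

# The number |(psi1, psi2)|^2 does not depend on the representative of the ray.
phase = complex(math.cos(1.0), math.sin(1.0))
psi1_rotated = [phase * c for c in psi1]
```

With these coefficients the mixture is found in the state `psi1` with probability 0.3² + 0.7² = 0.58, whereas the superposition is found there with probability 1.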

A superposition represents a number of different possibilities, but unlike
a mixture the different possibilities can interfere. Thus in the two-slit
experiment, the particle is in a superposition of states of passing through
the top slit and the bottom slit, and the interference of these possibilities
leads to the diffraction pattern. If we look to see which slit the particle
comes through then the particle will be in a mixture of states of passing
through the top slit and the bottom slit and there will be no diffraction
pattern.

If the system is in the state $\psi$ and $A$ is an observable with spectral
projections $E_\lambda$ then $(\psi, E_\lambda \psi)$ is the probability that if we perform an
experiment to determine the value of $A$ we will obtain a result $\le \lambda$.
Thus $(\psi, A\psi) = \int \lambda\,(\psi, dE_\lambda \psi)$ is the expected value of $A$ in the state $\psi$.
(The left hand side is meaningful if $\psi$ is in the domain of $A$; the integral
on the right hand side converges if $\psi$ is merely in the domain of $|A|^{1/2}$.)
The observable $A$ has the value $\lambda$ with certainty if and only if $\psi$ is an
eigenvector of $A$ with eigenvalue $\lambda$, $A\psi = \lambda\psi$.
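In finite dimensions the spectral integral reduces to a sum over eigenvalues, which the following sketch illustrates. It assumes a hypothetical observable that is already diagonal, with made-up eigenvalues -1, 0.5 and 2, and a unit vector written in the eigenbasis; these values are not from the text.

```python
# Hypothetical spectrum of A, with A diagonal in the chosen basis.
eigenvalues = [-1.0, 0.5, 2.0]

# Unit vector in the eigenbasis: 0.25 + 0.25 + 0.5 = 1.
psi = [0.5 + 0j, 0.5j, complex(0.5 ** 0.5)]


def distribution(lam):
    """(psi, E_lam psi): probability of obtaining a result <= lam."""
    return sum(abs(c) ** 2 for a, c in zip(eigenvalues, psi) if a <= lam)


def expected_value():
    """Discrete form of the integral of lam against (psi, dE_lam psi)."""
    return sum(a * abs(c) ** 2 for a, c in zip(eigenvalues, psi))


def inner_psi_A_psi():
    """(psi, A psi) computed directly, with A acting diagonally."""
    return sum(c.conjugate() * (a * c) for a, c in zip(eigenvalues, psi)).real
```

Here `distribution` is a step function that jumps at each eigenvalue by the squared modulus of the corresponding component of `psi`, and `expected_value()` agrees with `inner_psi_A_psi()`.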

Thus quantum mechanics differs from classical mechanics in not requiring
every observable to have a sharp value in every (pure) state. Furthermore,
it is in general impossible to find a state such that two given
observables have sharp values. Consider the position operator $q$ and the
momentum operator $p$ for a particle with one degree of freedom, and let
$\psi$ be in the domain of $p^2$, $q^2$, $pq$, and $qp$. Then
$(\psi, p^2\psi) - (\psi, p\psi)^2 = \big(\psi, (p - (\psi, p\psi))^2 \psi\big)$
is the variance of the observable $p$ in the state $\psi$ and
its square root is the standard deviation, which physicists frequently call
the dispersion and denote by $\Delta p$. Similarly for $(\psi, q^2\psi) - (\psi, q\psi)^2$. We
find, using the commutation rule

$$(pq - qp)\psi = \frac{\hbar}{i}\,\psi, \tag{14.2}$$

$$\begin{aligned}
0 &\le \big((\alpha q + ip)\psi, (\alpha q + ip)\psi\big) \\
  &= \alpha^2(\psi, q^2\psi) - i\alpha\big(\psi, (pq - qp)\psi\big) + (\psi, p^2\psi) \\
  &= \alpha^2(\psi, q^2\psi) - \alpha\hbar + (\psi, p^2\psi).
\end{aligned}$$


Since this is nonnegative for all real $\alpha$, the discriminant must be nonpositive,

$$\hbar^2 - 4(\psi, q^2\psi)(\psi, p^2\psi) \le 0. \tag{14.3}$$

The commutation relation (14.2) continues to hold if we replace $p$ by
$p - (\psi, p\psi)$ and $q$ by $q - (\psi, q\psi)$, so (14.3) continues to hold after this
replacement. That is,

$$\Delta q\,\Delta p \ge \frac{\hbar}{2}. \tag{14.4}$$

This is the well-known proof of the Heisenberg uncertainty relation. The
great importance of Heisenberg's discovery, however, was not the formal
deduction of this relation but the presentation of arguments that showed,
in an endless string of cases, that the relation (14.4) must hold on physical
grounds independently of the formalism.
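The relation (14.4) can be checked numerically for a particular state. The sketch below assumes a Gaussian wave function of width `sigma` (a choice made here for illustration, for which the product of the dispersions is expected to equal $\hbar/2$), units with $\hbar = 1$, and a simple midpoint-rule integration. The function name `dispersions` and its parameters are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def dispersions(sigma, half_width=12.0, steps=20000):
    """Return (dq, dp) for psi(x) = (2 pi sigma^2)^(-1/4) exp(-x^2 / (4 sigma^2)).

    psi is real and even, so (psi, q psi) = (psi, p psi) = 0. The variances
    therefore reduce to (psi, q^2 psi) and
    (psi, p^2 psi) = hbar^2 * integral of |psi'(x)|^2 dx.
    """
    limit = half_width * sigma          # integrate over [-limit, limit]
    dx = 2.0 * limit / steps
    norm = (2.0 * math.pi * sigma ** 2) ** -0.25
    q2 = 0.0
    p2 = 0.0
    for n in range(steps):
        x = -limit + (n + 0.5) * dx     # midpoint of the n-th subinterval
        psi = norm * math.exp(-x * x / (4.0 * sigma ** 2))
        dpsi = -x / (2.0 * sigma ** 2) * psi   # analytic derivative of psi
        q2 += x * x * psi * psi * dx
        p2 += HBAR ** 2 * dpsi * dpsi * dx
    return math.sqrt(q2), math.sqrt(p2)


dq, dp = dispersions(0.7)
product = dq * dp   # compare with HBAR / 2
```

Changing `sigma` trades a smaller `dq` for a larger `dp` while leaving the product at the lower bound, up to integration error.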

Thus probabilistic notions are central in quantum mechanics. Given

the state ψ, the observable A can be regarded as a random variable on the

probability space consisting of the real line with the measure $(\psi, dE_\lambda \psi)$,

where the $E_\lambda$ are the spectral projections of $A$. Similarly, any number

of commuting self-adjoint operators can be regarded as random variables

on a probability space. (Two self-adjoint operators are said to commute

if their spectral projections commute.) But, and it is this which makes

quantum mechanics so radically different from classical theories, the set of

all observables of the system in a given state cannot be regarded as a set

of random variables on a probability space. For example, the formalism of

quantum mechanics does not allow the possibility of p and q both having

sharp values even if the putative sharp values are unknown.

For a while it was thought by some that there might be “hidden
variables” (that is, a more refined description of the state of a system)
which would allow all observables to have sharp values if a complete
description of the system were known. Von Neumann [33] showed, however,
that any such theory would be a departure from quantum mechanics
rather than an extension of it. It follows from von Neumann's theorem
that the set of all self-adjoint operators in a given state cannot be
regarded as a family of random variables on a probability space. Here is
another result along these lines.

THEOREM 14.1 Let $A = (A_1, \ldots, A_n)$ be an $n$-tuple of operators on a
Hilbert space $\mathcal{H}$ such that for all $x$ in $\mathbb{R}^n$,

$$x \cdot A = x_1 A_1 + \ldots + x_n A_n$$


is essentially self-adjoint. Then either the $(A_1, \ldots, A_n)$ commute or there
is a $\psi$ in $\mathcal{H}$ with $\|\psi\| = 1$ such that there do not exist random variables
$\alpha_1, \ldots, \alpha_n$ on a probability space with the property that for all $x$ in $\mathbb{R}^n$
and $\lambda$ in $\mathbb{R}$,

$$\Pr\{x \cdot \alpha \le \lambda\} = \big(\psi, E_\lambda(x \cdot A)\psi\big)$$

where $x \cdot \alpha = x_1\alpha_1 + \ldots + x_n\alpha_n$ and the $E_\lambda(x \cdot A)$ are the spectral
projections of the closure of $x \cdot A$.

In other words, n observables can be regarded as random variables, in

all states, if and only if they commute.
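A simple non-commuting case shows how the obstruction arises. The sketch below takes the Pauli x and z matrices as $A_1$ and $A_2$; this choice is an illustration and is not taken from the text. Random variables $\alpha_1$, $\alpha_2$ with the property in the theorem would each take values in the spectrum $\{-1, 1\}$, so $(\alpha_1 + \alpha_2)/\sqrt{2}$ could only take the values $-\sqrt{2}$, $0$, $\sqrt{2}$. The distribution prescribed for $x = (1, 1)/\sqrt{2}$ is concentrated on the spectrum of $x \cdot A$, which the code computes to be $\{-1, 1\}$.

```python
import math
from itertools import product


def eigenvalues_2x2_hermitian(M):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = (M[0][0] + M[1][1]).real
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return sorted([(tr - disc) / 2.0, (tr + disc) / 2.0])


A1 = [[0j, 1 + 0j], [1 + 0j, 0j]]     # Pauli x (illustrative choice)
A2 = [[1 + 0j, 0j], [0j, -1 + 0j]]    # Pauli z (illustrative choice)


def combo(x1, x2):
    """The matrix x . A = x1 * A1 + x2 * A2."""
    return [[x1 * A1[r][c] + x2 * A2[r][c] for c in range(2)]
            for r in range(2)]


s = 1.0 / math.sqrt(2.0)
spec1 = eigenvalues_2x2_hermitian(combo(1, 0))   # values available to alpha_1
spec2 = eigenvalues_2x2_hermitian(combo(0, 1))   # values available to alpha_2
spec12 = eigenvalues_2x2_hermitian(combo(s, s))  # values required of x . alpha

# Values that s * (alpha_1 + alpha_2) could actually take.
candidate_values = {round(s * (a + b), 12) for a, b in product(spec1, spec2)}

# Candidate values that lie in the spectrum of x . A.
compatible = [v for v in candidate_values
              if any(abs(v - e) < 1e-9 for e in spec12)]
```

The list `compatible` comes out empty, so for this pair no state admits random variables with the required distributions, in line with the fact that the two matrices do not commute.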

Proof. We shall not distinguish notationally between x · A and its

closure.

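Suppose that for each unit vector $\psi$ in $\mathcal{H}$ there is such an $n$-tuple $\alpha$ of
random variables, and let $\mu_\psi$ be the probability distribution of $\alpha$ on $\mathbb{R}^n$.
That is, for each Borel set $B$ in $\mathbb{R}^n$, $\mu_\psi(B) = \Pr\{\alpha \in B\}$. If we integrate
first over the hyperplanes orthogonal to $x$, we find that

$$\begin{aligned}
\int_{\mathbb{R}^n} e^{ix\cdot\xi}\,d\mu_\psi(\xi)
  &= \int_{-\infty}^{\infty} e^{i\lambda}\,d\Pr\{x\cdot\alpha \le \lambda\} \\
  &= \int_{-\infty}^{\infty} e^{i\lambda}\big(\psi, dE_\lambda(x\cdot A)\psi\big) = (\psi, e^{ix\cdot A}\psi).
\end{aligned}$$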

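Thus the measure $\mu_\psi$ is the Fourier transform of $(\psi, e^{ix\cdot A}\psi)$. By the
polarization identity, if $\varphi$ and $\psi$ are in $\mathcal{H}$ there is a complex measure
$\mu_{\varphi\psi}$ such that $\mu_{\varphi\psi}$ is the Fourier transform of $(\varphi, e^{ix\cdot A}\psi)$ and $\mu_{\psi\psi} = \mu_\psi$.
For any Borel set $B$ in $\mathbb{R}^n$ there is a unique operator $\mu(B)$ such that
$(\varphi, \mu(B)\psi) = \mu_{\varphi\psi}(B)$, since $\mu_{\varphi\psi}$ depends linearly on $\psi$ and antilinearly
on $\varphi$. Thus we have

$$\int_{\mathbb{R}^n} e^{ix\cdot\xi}\,\big(\varphi, d\mu(\xi)\psi\big) = (\varphi, e^{ix\cdot A}\psi).$$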

The operator $\mu(B)$ is positive since $\mu_\psi$ is a positive measure. Consequently,
if we have a finite set of elements $\psi_j$ of $\mathcal{H}$ and corresponding


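points $x_j$ of $\mathbb{R}^n$, then

$$\begin{aligned}
\sum_{j,k} \big(\psi_k, e^{i(x_j - x_k)\cdot A}\psi_j\big)
  &= \sum_{j,k} \int_{\mathbb{R}^n} e^{i(x_j - x_k)\cdot\xi}\,\big(\psi_k, d\mu(\xi)\psi_j\big) \\
  &= \int_{\mathbb{R}^n} \big(\psi(\xi), d\mu(\xi)\psi(\xi)\big) \ge 0,
\end{aligned}$$

where

$$\psi(\xi) = \sum_j e^{ix_j\cdot\xi}\,\psi_j.$$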

Furthermore, $e^{i0\cdot A} = 1$ and $e^{i(-x)\cdot A} = (e^{ix\cdot A})^*$. Under these conditions, the
theorem on unitary dilations of Nagy [34, Appendix, p. 21] implies that
there is a Hilbert space $\mathcal{K}$ containing $\mathcal{H}$ and a unitary representation
$x \mapsto U(x)$ of $\mathbb{R}^n$ on $\mathcal{K}$ such that, if $E$ is the orthogonal projection of $\mathcal{K}$
onto $\mathcal{H}$, then

$$EU(x)\psi = e^{ix\cdot A}\psi$$

for all $x$ in $\mathbb{R}^n$ and all $\psi$ in $\mathcal{H}$. Since $e^{ix\cdot A}$ is already unitary,

$$\|U(x)\psi\| = \|e^{ix\cdot A}\psi\| = \|\psi\|,$$

so that $\|EU(x)\psi\| = \|U(x)\psi\|$. Consequently, $EU(x)\psi = U(x)\psi$ and
each $U(x)$ maps $\mathcal{H}$ into itself, so that $U(x)\psi = e^{ix\cdot A}\psi$ for all $\psi$ in $\mathcal{H}$.
Since $x \mapsto U(x)$ is a unitary representation of the commutative group $\mathbb{R}^n$,