


$q_k$ and $p_j$ ($j, k = 1, 2, 3$) satisfying the commutation relations
$$p_j q_k - q_k p_j = \frac{\hbar}{i}\,\delta_{jk}$$
and diagonalizes the matrix $H = p^2/2m + V(q)$. Schrödinger remarks [30,
p. 46]:
“My theory was inspired by L. de Broglie, Ann. de Physique (10) 3,
p. 22, 1925 (Theses, Paris, 1924), and by brief, yet infinitely far-seeing
remarks of A. Einstein, Berl. Ber., 1925, p. 9 et seq. I did not at all
suspect any relation to Heisenberg's theory at the beginning. I naturally
knew about his theory, but was discouraged, if not repelled, by what
appeared to me as very difficult methods of transcendental algebra, and
by the want of perspicuity (Anschaulichkeit).”
The remarkable thing was that where the two theories disagreed with
the old quantum theory of Bohr, they agreed with each other (and with
experiment!). Schrödinger quickly discovered the mathematical
equivalence of the two theories, based on letting $q_k$ correspond to the operator
of multiplication by the coordinate function $x_k$ and letting $p_j$ correspond
to the operator $(\hbar/i)\,\partial/\partial x_j$ (see the fourth paper in [30]).
Schrödinger maintained (and most physicists agree) that the
mathematical equivalence of two physical theories is not the same as their
physical equivalence, and went on to describe a possible physical
interpretation of the wave function $\psi$. According to this interpretation an
electron with wave function $\psi$ is not a localized particle but a smeared
out distribution of electricity with charge density $e\rho$ and electric current
$ej$, where
$$\rho = |\psi|^2, \qquad j = \frac{i\hbar}{2m}\,(\psi \operatorname{grad} \bar\psi - \bar\psi \operatorname{grad} \psi).$$
(The quantities ρ and j determine ψ except for a multiplicative factor
of absolute value one.) This interpretation works very well for a single
electron bound in an atom, provided one neglects the self-repulsion of the
smeared out electron. However, when there are $n$ electrons, $\psi$ is a function
on configuration space $\mathbb{R}^{3n}$, rather than coordinate space $\mathbb{R}^3$, which makes
the interpretation of $\psi$ as a physically real object very difficult. Also, for
free electrons $\psi$, and consequently $\rho$, spreads out more and more as time
goes on; yet the arrival of electrons at a scintillation screen is always
signaled by a sharply localized flash, never by a weak, spread out flash.
These objections were made to Schrödinger's theory when he lectured on
it in Copenhagen, and he reputedly said he wished he had never invented
the theory.
The accepted interpretation of the wave function ψ was put forward
by Born [31], and quantum mechanics was given its present form by Dirac
[32] and von Neumann [33]. Let us briefly describe quantum mechanics,
neglecting superselection rules.
To each physical system there corresponds a Hilbert space $\mathcal{H}$. To
every state (also called pure state) of the system there corresponds an
equivalence class of unit vectors in $\mathcal{H}$, where $\psi_1$ and $\psi_2$ are called
equivalent if $\psi_1 = a\psi_2$ for some complex number $a$ of absolute value one.
(Such an equivalence class, which is a circle, is frequently called a ray.)
The correspondence between states and rays is one-to-one. To each
observable of the system there corresponds a self-adjoint operator, and the
correspondence is again one-to-one. The development of the system in
time is described by a family of unitary operators $U(t)$ on $\mathcal{H}$. There are
two ways of thinking about this. In the Schrödinger picture, the state of
the system changes with time, $\psi(t) = U(t)\psi_0$, where $\psi_0$ is the state at
time 0, and observables do not change with time. In the Heisenberg
picture, observables change with time, $A(t) = U(t)^{-1} A_0 U(t)$, and the state
does not change with time. The two pictures are equivalent, and it is a
matter of convention which is used. For an isolated physical system, the
dynamics is given by $U(t) = \exp(-(i/\hbar)Ht)$, where $H$, the Hamiltonian,
is the self-adjoint operator representing the energy of the system.
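The equivalence of the two pictures can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: a two-dimensional Hilbert space, the hypothetical Hamiltonian $H = \hbar\omega\sigma_x$, the observable $A_0 = \sigma_z$, and the initial state $\psi_0 = (1, 0)$. None of these choices come from the text. For this $H$ the exponential has the closed form $U(t) = \cos(\omega t)\,I - i\sin(\omega t)\,\sigma_x$, so $\hbar$ cancels.

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose; for a unitary matrix this is its inverse."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def apply(a, v):
    """Apply a 2x2 matrix to a vector in C^2."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

def inner(u, v):
    """Inner product (u, v): antilinear in u, linear in v."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def U(t, omega=1.0):
    """exp(-(i/hbar) H t) for the assumed H = hbar * omega * sigma_x."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return [[complex(c), -1j * s], [-1j * s, complex(c)]]

A0 = [[1 + 0j, 0j], [0j, -1 + 0j]]   # assumed observable: sigma_z
psi0 = [1 + 0j, 0j]                  # assumed initial state

def expectation_schrodinger(t):
    """(psi(t), A0 psi(t)) with psi(t) = U(t) psi0."""
    psi_t = apply(U(t), psi0)
    return inner(psi_t, apply(A0, psi_t)).real

def expectation_heisenberg(t):
    """(psi0, A(t) psi0) with A(t) = U(t)^{-1} A0 U(t)."""
    Ut = U(t)
    A_t = matmul(matmul(dagger(Ut), A0), Ut)
    return inner(psi0, apply(A_t, psi0)).real

for t in (0.0, 0.3, 1.0, 2.5):
    print(t, expectation_schrodinger(t), expectation_heisenberg(t))
```

Both columns agree at every time, and for this particular choice they equal $\cos(2\omega t)$.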
It may happen that one does not know the state of the physical
system, but merely that it is in state $\psi_1$ with probability $w_1$, state $\psi_2$ with
probability $w_2$, etc., where $w_1 + w_2 + \ldots = 1$. This is called a mixture
(impure state), and we shall not describe its mathematical representation.
The important new notion is that of a superposition of states. Suppose
that we have two states $\psi_1$ and $\psi_2$. The number $|(\psi_1, \psi_2)|^2$ does not
depend on the choice of representatives of the rays and lies between 0
and 1. Therefore, it can be regarded as a probability. If we know that the
system is in the state $\psi_1$ and we perform an experiment to see whether
or not the system is in the state $\psi_2$, then $|(\psi_1, \psi_2)|^2$ is the probability of
finding that the system is indeed in the state $\psi_2$. We can write

ψ1 = (ψ2 , ψ1 )ψ2 + (ψ3 , ψ1 )ψ3

where $\psi_3$ is orthogonal to $\psi_2$. We say that $\psi_1$ is a superposition of the
states $\psi_2$ and $\psi_3$. Consider the mixture that is in the state $\psi_2$ with
probability $|(\psi_2, \psi_1)|^2$ and in the state $\psi_3$ with probability $|(\psi_3, \psi_1)|^2$.
Then $\psi_1$ and the mixture have equal probabilities of being found in the
states $\psi_2$ and $\psi_3$, but they are quite different. For example, $\psi_1$ has the
probability $|(\psi_1, \psi_1)|^2 = 1$ of being found in the state $\psi_1$, whereas the
mixture has only the probability $|(\psi_2, \psi_1)|^4 + |(\psi_3, \psi_1)|^4$ of being found in
the state $\psi_1$.
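The comparison between the superposition and the mixture can be made concrete in $\mathbb{C}^2$. The following is a minimal sketch, assuming the standard basis vectors for $\psi_2$ and $\psi_3$ and the hypothetical coefficients $\cos\theta$, $\sin\theta$ with $\theta = \pi/6$; these values are illustrative and do not come from the text.

```python
import math

def inner(u, v):
    """Inner product (u, v): antilinear in u, linear in v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def prob(target, state):
    """Probability |(target, state)|^2 of finding `state` in `target`."""
    return abs(inner(target, state)) ** 2

# Orthonormal states psi2 and psi3 (assumed: the standard basis of C^2).
psi2 = [1 + 0j, 0j]
psi3 = [0j, 1 + 0j]

# psi1 = (psi2, psi1) psi2 + (psi3, psi1) psi3 with assumed coefficients.
theta = math.pi / 6
psi1 = [complex(math.cos(theta)), complex(math.sin(theta))]

# Weights of the mixture described in the text.
w2 = prob(psi2, psi1)
w3 = prob(psi3, psi1)

def mixture_prob(target):
    """Probability of finding the mixture in `target`."""
    return w2 * prob(target, psi2) + w3 * prob(target, psi3)

# Same probabilities for psi2 and psi3 ...
print(prob(psi2, psi1), mixture_prob(psi2))
print(prob(psi3, psi1), mixture_prob(psi3))
# ... but different probabilities of being found in psi1.
pure_in_psi1 = prob(psi1, psi1)
mixture_in_psi1 = mixture_prob(psi1)
print(pure_in_psi1, mixture_in_psi1)
```

With these coefficients the weights are about 0.75 and 0.25, so the mixture is found in $\psi_1$ with probability about $0.75^2 + 0.25^2 = 0.625$, while the superposition is found there with probability 1.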
A superposition represents a number of different possibilities, but
unlike a mixture the different possibilities can interfere. Thus in the two-slit
experiment, the particle is in a superposition of states of passing through
the top slit and the bottom slit, and the interference of these possibilities
leads to the diffraction pattern. If we look to see which slit the particle
comes through then the particle will be in a mixture of states of passing
through the top slit and the bottom slit and there will be no diffraction.
If the system is in the state $\psi$ and $A$ is an observable with spectral
projections $E_\lambda$, then $(\psi, E_\lambda \psi)$ is the probability that if we perform an
experiment to determine the value of $A$ we will obtain a result $\le \lambda$.
Thus $(\psi, A\psi) = \int \lambda\,(\psi, dE_\lambda \psi)$ is the expected value of $A$ in the state $\psi$.
(The left hand side is meaningful if $\psi$ is in the domain of $A$; the integral
on the right hand side converges if $\psi$ is merely in the domain of $|A|^{1/2}$.)
The observable $A$ has the value $\lambda$ with certainty if and only if $\psi$ is an
eigenvector of $A$ with eigenvalue $\lambda$, $A\psi = \lambda\psi$.
Thus quantum mechanics differs from classical mechanics in not
requiring every observable to have a sharp value in every (pure) state.
Furthermore, it is in general impossible to find a state such that two given
observables have sharp values. Consider the position operator $q$ and the
momentum operator $p$ for a particle with one degree of freedom, and let
$\psi$ be in the domain of $p^2$, $q^2$, $pq$, and $qp$. Then $(\psi, p^2\psi) - (\psi, p\psi)^2 =
(\psi, (p - (\psi, p\psi))^2 \psi)$ is the variance of the observable $p$ in the state $\psi$ and
its square root is the standard deviation, which physicists frequently call
the dispersion and denote by $\Delta p$. Similarly for $(\psi, q^2\psi) - (\psi, q\psi)^2$. We
find, using the commutation rule
$$(pq - qp)\psi = \frac{\hbar}{i}\,\psi, \tag{14.2}$$
that
$$0 \le \big((\alpha q + ip)\psi, (\alpha q + ip)\psi\big)
= \alpha^2 (\psi, q^2\psi) - i\alpha\,\big(\psi, (pq - qp)\psi\big) + (\psi, p^2\psi)
= \alpha^2 (\psi, q^2\psi) - \alpha\hbar + (\psi, p^2\psi).$$

Since this is nonnegative for all real $\alpha$, the discriminant must be nonpositive,
$$\hbar^2 - 4(\psi, q^2\psi)(\psi, p^2\psi) \le 0. \tag{14.3}$$
The commutation relation (14.2) continues to hold if we replace $p$ by
$p - (\psi, p\psi)$ and $q$ by $q - (\psi, q\psi)$, so (14.3) continues to hold after this
replacement. That is,
$$\Delta q\,\Delta p \ge \frac{\hbar}{2}. \tag{14.4}$$
This is the well-known proof of the Heisenberg uncertainty relation. The
great importance of Heisenberg's discovery, however, was not the formal
deduction of this relation but the presentation of arguments that showed,
in an endless string of cases, that the relation (14.4) must hold on physical
grounds independently of the formalism.
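The relation (14.4) can be checked numerically for particular wave functions. The sketch below is an illustration only, under stated assumptions: units with $\hbar = 1$, a uniform grid, and two hypothetical real wave functions (a Gaussian and $x/\sigma$ times that Gaussian), neither of which appears in the text. Both have $(\psi, q\psi) = (\psi, p\psi) = 0$, so the variances reduce to $(\psi, q^2\psi)$ and $(\psi, p^2\psi) = \hbar^2 \int |\psi'|^2\,dx$.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1

def ground(x, sigma):
    """Real, normalized Gaussian; |psi|^2 has variance sigma^2."""
    norm = (2.0 * math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-x * x / (4.0 * sigma ** 2))

def excited(x, sigma):
    """(x / sigma) times the Gaussian above; normalized, mean zero."""
    return (x / sigma) * ground(x, sigma)

def dispersion_product(psi_fn, sigma, half_width=12.0, n=4001):
    """Estimate Delta q * Delta p on a grid for a real, mean-zero psi.

    (psi, q^2 psi) is a Riemann sum of x^2 |psi|^2, and (psi, p^2 psi)
    is hbar^2 times a Riemann sum of |psi'|^2, with psi' approximated
    by central differences.
    """
    dx = 2.0 * half_width * sigma / (n - 1)
    xs = [-half_width * sigma + k * dx for k in range(n)]
    psi = [psi_fn(x, sigma) for x in xs]
    q2 = sum(x * x * f * f for x, f in zip(xs, psi)) * dx
    p2 = HBAR ** 2 * sum(
        ((psi[k + 1] - psi[k - 1]) / (2.0 * dx)) ** 2
        for k in range(1, n - 1)
    ) * dx
    return math.sqrt(q2 * p2)

for sigma in (0.5, 1.0, 2.0):
    print(sigma,
          dispersion_product(ground, sigma),
          dispersion_product(excited, sigma))
```

Up to discretization error, the Gaussian attains the bound $\hbar/2$ for every width $\sigma$, while the second wave function gives $3\hbar/2$, consistent with (14.4).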
Thus probabilistic notions are central in quantum mechanics. Given
the state $\psi$, the observable $A$ can be regarded as a random variable on the
probability space consisting of the real line with the measure $(\psi, dE_\lambda \psi)$,
where the $E_\lambda$ are the spectral projections of $A$. Similarly, any number
of commuting self-adjoint operators can be regarded as random variables
on a probability space. (Two self-adjoint operators are said to commute
if their spectral projections commute.) But, and it is this which makes
quantum mechanics so radically different from classical theories, the set of
all observables of the system in a given state cannot be regarded as a set
of random variables on a probability space. For example, the formalism of
quantum mechanics does not allow the possibility of $p$ and $q$ both having
sharp values even if the putative sharp values are unknown.
For a while it was thought by some that there might be “hidden
variables” (that is, a more refined description of the state of a system)
which would allow all observables to have sharp values if a complete
description of the system were known. Von Neumann [33] showed, however,
that any such theory would be a departure from quantum mechanics
rather than an extension of it. It follows from von Neumann's theorem
that the set of all self-adjoint operators in a given state cannot be
regarded as a family of random variables on a probability space. Here is
another result along these lines.

THEOREM 14.1 Let $A = (A_1, \ldots, A_n)$ be an $n$-tuple of operators on a
Hilbert space $\mathcal{H}$ such that for all $x$ in $\mathbb{R}^n$,
$$x \cdot A = x_1 A_1 + \ldots + x_n A_n$$
is essentially self-adjoint. Then either the $(A_1, \ldots, A_n)$ commute or there
is a $\psi$ in $\mathcal{H}$ with $\|\psi\| = 1$ such that there do not exist random variables
$\alpha_1, \ldots, \alpha_n$ on a probability space with the property that for all $x$ in $\mathbb{R}^n$
and $\lambda$ in $\mathbb{R}$,
$$\Pr\{x \cdot \alpha \le \lambda\} = \big(\psi, E_\lambda(x \cdot A)\psi\big),$$
where $x \cdot \alpha = x_1 \alpha_1 + \ldots + x_n \alpha_n$ and the $E_\lambda(x \cdot A)$ are the spectral
projections of the closure of $x \cdot A$.

In other words, n observables can be regarded as random variables, in
all states, if and only if they commute.
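A small worked case may help fix ideas. It is an illustration only and not part of the proof: the choice $n = 2$, $\mathcal{H} = \mathbb{C}^2$, and the two Pauli matrices below are assumptions that do not come from the text. Here the obstruction shows up for every unit vector $\psi$, which is more than the theorem asserts.

```latex
% Assumed example: A_1 = sigma_x, A_2 = sigma_z acting on C^2.
\[
A_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
A_2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
x \cdot A = \begin{pmatrix} x_2 & x_1 \\ x_1 & -x_2 \end{pmatrix}.
\]
% Squaring shows that the spectrum of x . A is { |x|, -|x| }.
\[
(x \cdot A)^2 = (x_1^2 + x_2^2)\, I
\quad\Longrightarrow\quad
\Pr\bigl\{\, x \cdot \alpha \in \{ |x|, -|x| \} \,\bigr\} = 1
\quad \text{for every unit vector } \psi .
\]
% Apply this with x = (1,0), x = (0,1), and x = (1,1).
\[
\alpha_1 \in \{\pm 1\}, \quad \alpha_2 \in \{\pm 1\}
\quad\Longrightarrow\quad
\alpha_1 + \alpha_2 \in \{-2,\, 0,\, 2\},
\qquad \text{but} \qquad
\alpha_1 + \alpha_2 \in \{\pm \sqrt{2}\}
\quad \text{almost surely.}
\]
```

The two requirements on $\alpha_1 + \alpha_2$ are incompatible, so no such random variables exist in this example, in agreement with the fact that $A_1$ and $A_2$ do not commute.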

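Proof. We shall not distinguish notationally between $x \cdot A$ and its closure.
Suppose that for each unit vector $\psi$ in $\mathcal{H}$ there is such an $n$-tuple $\alpha$ of
random variables, and let $\mu_\psi$ be the probability distribution of $\alpha$ on $\mathbb{R}^n$.
That is, for each Borel set $B$ in $\mathbb{R}^n$, $\mu_\psi(B) = \Pr\{\alpha \in B\}$. If we integrate
first over the hyperplanes orthogonal to $x$, we find that
$$\int_{\mathbb{R}^n} e^{ix\cdot\xi}\, d\mu_\psi(\xi)
= \int_{-\infty}^{\infty} e^{i\lambda}\, d\Pr\{x \cdot \alpha \le \lambda\}
= \int_{-\infty}^{\infty} e^{i\lambda}\, \big(\psi, dE_\lambda(x \cdot A)\psi\big)
= (\psi, e^{ix\cdot A}\psi).$$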

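Thus the measure $\mu_\psi$ is the Fourier transform of $(\psi, e^{ix\cdot A}\psi)$. By the
polarization identity, if $\varphi$ and $\psi$ are in $\mathcal{H}$ there is a complex measure
$\mu_{\varphi\psi}$ such that $\mu_{\varphi\psi}$ is the Fourier transform of $(\varphi, e^{ix\cdot A}\psi)$ and $\mu_{\psi\psi} = \mu_\psi$.
For any Borel set $B$ in $\mathbb{R}^n$ there is a unique operator $\mu(B)$ such that
$(\varphi, \mu(B)\psi) = \mu_{\varphi\psi}(B)$, since $\mu_{\varphi\psi}$ depends linearly on $\psi$ and antilinearly
on $\varphi$. Thus we have
$$\int_{\mathbb{R}^n} e^{ix\cdot\xi}\, \big(\varphi, d\mu(\xi)\psi\big) = (\varphi, e^{ix\cdot A}\psi).$$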

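The operator $\mu(B)$ is positive since $\mu_\psi$ is a positive measure.
Consequently, if we have a finite set of elements $\psi_j$ of $\mathcal{H}$ and corresponding
points $x_j$ of $\mathbb{R}^n$, then
$$\sum_{j,k} \big(\psi_k, e^{i(x_j - x_k)\cdot A}\psi_j\big)
= \int_{\mathbb{R}^n} \sum_{j,k} e^{i(x_j - x_k)\cdot\xi}\, \big(\psi_k, d\mu(\xi)\psi_j\big)
= \int_{\mathbb{R}^n} \big(\psi(\xi), d\mu(\xi)\psi(\xi)\big) \ge 0,$$
where
$$\psi(\xi) = \sum_j e^{ix_j\cdot\xi}\,\psi_j.$$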

Furthermore, $e^{i0\cdot A} = 1$ and $e^{i(-x)\cdot A} = (e^{ix\cdot A})^*$. Under these conditions, the
theorem on unitary dilations of Nagy [34, Appendix, p. 21] implies that
there is a Hilbert space $\mathcal{K}$ containing $\mathcal{H}$ and a unitary representation
$x \mapsto U(x)$ of $\mathbb{R}^n$ on $\mathcal{K}$ such that, if $E$ is the orthogonal projection of $\mathcal{K}$
onto $\mathcal{H}$, then
$$EU(x)\psi = e^{ix\cdot A}\psi$$
for all $x$ in $\mathbb{R}^n$ and all $\psi$ in $\mathcal{H}$. Since $e^{ix\cdot A}$ is already unitary,
$$\|U(x)\psi\| = \|e^{ix\cdot A}\psi\| = \|\psi\|,$$
so that $\|EU(x)\psi\| = \|U(x)\psi\|$. Consequently, $EU(x)\psi = U(x)\psi$ and
each $U(x)$ maps $\mathcal{H}$ into itself, so that $U(x)\psi = e^{ix\cdot A}\psi$ for all $\psi$ in $\mathcal{H}$.
Since $x \mapsto U(x)$ is a unitary representation of the commutative group $\mathbb{R}^n$,

