

(S3). The conditions (R3) and (S2) hold and det σ∗(t) > 0 a.e., for a.e. t.

We obtain theorems analogous to the preceding ones. In particular, if
a ≤ b, a ∈ I, b ∈ I, then for an (S1) process

E{x(b) − x(a) | Fb} = E{ ∫_a^b D∗x(s) ds | Fb },          (11.11)

and for an (S2) process

E{[y∗(b) − y∗(a)]² | Fb} = E{ ∫_a^b σ∗²(s) ds | Fb }.     (11.12)

THEOREM 11.10 Let x be an (S1) process. Then

EDx(t) = ED∗x(t)                                          (11.13)

for all t in I. Let x be an (S2) process. Then

Eσ²(t) = Eσ∗²(t)                                          (11.14)

for all t in I.

Proof. By Theorem 11.1 and (11.11), if we take absolute expectations
we find

E[x(b) − x(a)] = E ∫_a^b Dx(s) ds = E ∫_a^b D∗x(s) ds

for all a and b in I. Since s ↦ Dx(s) and s ↦ D∗x(s) are continuous
in L¹, (11.13) holds. Similarly, (11.14) follows from Theorem 11.4 and
(11.12). QED.
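Theorem 11.10 can be illustrated numerically. The following sketch (our illustrative construction, not from the text) takes the Gaussian Markov process dx = −ωx dt + dw with w a Wiener process normalized so that Var dw = dt, started at x(0) = 1, and estimates EDx(t) and ED∗x(t) by forward and backward mean difference quotients; both should agree with d/dt Ex(t). The parameter values ω, t, h, N are arbitrary choices.

```python
# Monte-Carlo check of EDx(t) = ED*x(t) (Theorem 11.10) for the process
# dx = -omega*x dt + dw, started at x(0) = 1 (illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
omega, t, h, N = 1.0, 1.0, 0.01, 1_000_000

def ou_step(x, dt):
    """Exact transition of dx = -omega*x dt + dw over time dt (Var dw = dt)."""
    decay = np.exp(-omega * dt)
    var = (1.0 - decay**2) / (2.0 * omega)
    return decay * x + rng.normal(0.0, np.sqrt(var), size=x.shape)

x_prev = ou_step(np.full(N, 1.0), t - h)   # samples of x(t-h)
x_mid  = ou_step(x_prev, h)                # samples of x(t)
x_next = ou_step(x_mid, h)                 # samples of x(t+h)

forward  = (x_next - x_mid).mean() / h     # estimates E Dx(t)
backward = (x_mid - x_prev).mean() / h     # estimates E D*x(t)
exact    = -omega * np.exp(-omega * t)     # d/dt Ex(t), since Ex(t) = e^{-omega t}
print(forward, backward, exact)
```

Both quotients converge to the same limit as h → 0, as the theorem asserts.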

THEOREM 11.11 Let x be an (S1) process. Then x is a constant (i.e.,
x(t) is the same random variable for all t) if and only if Dx = D∗x = 0.

Proof. The only if part of the theorem is trivial. Suppose that Dx =
D∗x = 0. By Theorem 11.2, x is a martingale and a martingale with the
direction of time reversed. Let t1 ≤ t2, x1 = x(t1), x2 = x(t2). Then x1
and x2 are in L¹ and E{x1 | x2} = x2, E{x2 | x1} = x1. We wish to show
that x1 = x2 (a.e., of course).
If x1 and x2 are in L² (as they are if x is an (S2) process) there is a
trivial proof, as follows. We have

E{(x2 − x1)² | x1} = E{x2² − 2x2x1 + x1² | x1} = E{x2² | x1} − x1²,

so that if we take absolute expectations we find

E(x2 − x1)² = Ex2² − Ex1².

The same result holds with x1 and x2 interchanged. Thus E(x2 − x1)² = 0,
x2 = x1 a.e.
G. A. Hunt showed me the following proof for the general case (x1, x2
in L¹).
Let μ be the distribution of (x1, x2) in the plane. We can take x1 and
x2 to be the coordinate functions. Then there is a conditional probability
distribution p(x1, ·) such that if ν is the distribution of x1 and f is a
positive Baire function on ℝ²,

∫ f(x1, x2) dμ(x1, x2) = ∫∫ f(x1, x2) p(x1, dx2) dν(x1).

(See Doob [15, §6, pp. 26–34].) Then

E{φ(x2) | x1} = ∫ φ(x2) p(x1, dx2)    a.e. [ν]

provided φ(x2) is in L¹. Take φ to be strictly convex with |φ(ξ)| ≤ |ξ|
for all real ξ (so that φ(x2) is in L¹). Then, for each x1, since φ is strictly
convex, Jensen's inequality gives

φ( ∫ x2 p(x1, dx2) ) < ∫ φ(x2) p(x1, dx2)

unless x2 = ∫ x2 p(x1, dx2) a.e. [p(x1, ·)]. But

∫ x2 p(x1, dx2) = x1    a.e. [ν],

so, unless x2 = x1 a.e. [ν],

φ(x1) < ∫ φ(x2) p(x1, dx2).

If we take absolute expectations, we find Eφ(x1) < Eφ(x2) unless x2 = x1
a.e. The same argument gives the reverse inequality, so x2 = x1 a.e. QED.
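The proof needs a strictly convex φ with |φ(ξ)| ≤ |ξ|; one concrete choice (ours, not the text's) is φ(ξ) = √(1 + ξ²) − 1, since √(1 + ξ²) ≤ 1 + |ξ|. The sketch below checks the bound on a grid of points and exhibits the strictness of Jensen's inequality for a nondegenerate sampled distribution:

```python
# Illustration of the Jensen step: phi(xi) = sqrt(1 + xi^2) - 1 is strictly
# convex with |phi(xi)| <= |xi|, so phi(EX) < E phi(X) when X is nondegenerate.
# The particular phi and sample distribution are illustrative choices.
import math, random

def phi(xi):
    return math.sqrt(1.0 + xi * xi) - 1.0

# |phi(xi)| <= |xi| on a grid of test points
for xi in [k / 10.0 for k in range(-100, 101)]:
    assert abs(phi(xi)) <= abs(xi) + 1e-12

# strict Jensen inequality for a nondegenerate (here Gaussian) sample
random.seed(1)
xs = [random.gauss(0.0, 1.0) for _ in range(10_000)]
mean = sum(xs) / len(xs)
gap = sum(phi(x) for x in xs) / len(xs) - phi(mean)
print(gap)   # strictly positive for a nondegenerate sample
```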

THEOREM 11.12 Let x and y be (S1) processes with respect to the
same families of σ-algebras Pt and Ft, and suppose that x(t), y(t), Dx(t),
Dy(t), D∗x(t), and D∗y(t) all lie in L² and are continuous functions of
t in L². Then

d/dt Ex(t)y(t) = E[Dx(t) · y(t) + x(t)D∗y(t)].

Proof. We need to show, for a and b in I, that

E[x(b)y(b) − x(a)y(a)] = E ∫_a^b [Dx(t) · y(t) + x(t)D∗y(t)] dt.

(Notice that the integrand is continuous.) Divide [a, b] into n equal parts:
tj = a + j(b − a)/n for j = 0, . . . , n. Then

E[x(b)y(b) − x(a)y(a)] = lim Σj E[ x(tj+1)y(tj) − x(tj)y(tj−1) ]

 = lim Σj E[ (x(tj+1) − x(tj)) · (y(tj) + y(tj−1))/2
             + (x(tj+1) + x(tj))/2 · (y(tj) − y(tj−1)) ]

 = lim Σj E[ Dx(tj) · y(tj) + x(tj)D∗y(tj) ] (tj+1 − tj)

 = E ∫_a^b [Dx(t) · y(t) + x(t)D∗y(t)] dt.   QED.
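The proof rests on an exact discrete product-rule identity: the forward difference of x is paired with an average of past values of y, and the backward difference of y with an average of future values of x. A quick sketch checks the identity on random inputs (the variable names are ours):

```python
# Check of the discrete identity behind the proof:
#   x_{j+1} y_j - x_j y_{j-1}
#     = (x_{j+1} - x_j)(y_j + y_{j-1})/2 + (x_{j+1} + x_j)/2 (y_j - y_{j-1}),
# which is exact, not merely an approximation.
import random

random.seed(0)
max_err = 0.0
for _ in range(1000):
    x1, x2 = random.random(), random.random()   # x(t_j), x(t_{j+1})
    y0, y1 = random.random(), random.random()   # y(t_{j-1}), y(t_j)
    lhs = x2 * y1 - x1 * y0
    rhs = (x2 - x1) * (y1 + y0) / 2 + (x2 + x1) / 2 * (y1 - y0)
    max_err = max(max_err, abs(lhs - rhs))
print(max_err)   # only floating-point rounding
```

Summing the left side over j telescopes to x(b)y(b) − x(a)y(a) in the limit, which is why this particular splitting appears in the proof.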


Now let us assume that the past Pt and the future Ft are condi-
tionally independent given the present Pt ∩ Ft. That is, if f is any
Ft-measurable function in L¹ then E{f | Pt} = E{f | Pt ∩ Ft}, and if f
is any Pt-measurable function in L¹ then E{f | Ft} = E{f | Pt ∩ Ft}.
If x is a Markov process and Pt is generated by the x(s) with s ≤ t, and
Ft by the x(s) with s ≥ t, this is certainly the case. However, the as-
sumption is much weaker. It applies, for example, to the position x(t) of
the Ornstein-Uhlenbeck process. The reason is that the present Pt ∩ Ft
may not be generated by x(t); for example, in the Ornstein-Uhlenbeck
case v(t) = dx(t)/dt is also Pt ∩ Ft-measurable.
With the above assumption on the Pt and Ft, if x is an (S1) process
then Dx(t) and D∗x(t) are Pt ∩ Ft-measurable, and we can form DD∗x(t)
and D∗Dx(t) if they exist. Assuming they exist, we define

a(t) = ½DD∗x(t) + ½D∗Dx(t)                                (11.15)

and call it the mean second derivative or mean acceleration.
If x is a sufficiently smooth function of t then a(t) = d²x(t)/dt². This
is also true of other possible candidates for the title of mean acceleration,
such as DD∗x(t), D∗Dx(t), DDx(t), D∗D∗x(t), and ½DDx(t) + ½D∗D∗x(t).
Of these, the first four distinguish between the two choices of direction for
the time axis, and so can be discarded. To discuss the fifth possibility,
consider the Gaussian Markov process x(t) satisfying

dx(t) = −ωx(t) dt + dw(t),

where w is a Wiener process, in equilibrium (that is, with the invariant
Gaussian measure as initial measure). Then

Dx(t) = −ωx(t),
D∗x(t) = ωx(t),
a(t) = −ω²x(t),
½DDx(t) + ½D∗D∗x(t) = ω²x(t).
This process is familiar to us: it is the position in the Smoluchowski de-
scription of the highly overdamped harmonic oscillator (or the velocity
of a free particle in the Ornstein-Uhlenbeck theory). The characteristic
feature of this process is its constant tendency to go towards the origin,
no matter which direction of time is taken. Our definition of mean ac-
celeration, which gives a(t) = −ω²x(t), is kinematically the appropriate
definition.
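The relations Dx(t) = −ωx(t) and D∗x(t) = ωx(t) can be recovered from sample paths: regressing the forward increment on the present value gives the forward drift, and regressing the backward increment on the present value gives the backward drift. A sketch (our construction, with Var dw = dt and illustrative parameters):

```python
# Monte-Carlo illustration of Dx(t) = -omega*x(t) and D*x(t) = +omega*x(t)
# for the stationary process dx = -omega*x dt + dw (Var dw = dt).
# Parameters omega, h, N are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
omega, h, N = 1.0, 0.01, 1_000_000

decay = np.exp(-omega * h)
step_var = (1.0 - decay**2) / (2.0 * omega)   # exact transition variance over h
x = rng.normal(0.0, np.sqrt(1.0 / (2.0 * omega)), size=N)  # equilibrium start
x_next = decay * x + rng.normal(0.0, np.sqrt(step_var), size=N)

def slope(dx, pres):
    """Least-squares slope of the increment against the present value."""
    return np.dot(dx, pres) / np.dot(pres, pres)

D_hat = slope(x_next - x, x) / h            # forward drift, conditioned on x(t)
Dstar_hat = slope(x_next - x, x_next) / h   # backward drift, seen from t + h
print(D_hat, Dstar_hat)   # close to -omega and +omega
```

The sign flip between the two regressions is exactly the "constant tendency to go towards the origin, no matter which direction of time is taken."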


The stochastic integral was invented by Itô:
[27]. Kiyosi Itô, "On Stochastic Differential Equations", Memoirs of the
American Mathematical Society, Number 4 (1951).
Doob gave a treatment based on martingales [15, §6, pp. 436–451].
Our discussion of stochastic integrals, as well as most of the other material
of this section, is based on Doob's book.
Chapter 12

Dynamics of stochastic motion

The fundamental law of non-relativistic dynamics is Newton's law
F = ma: the force on a particle is the product of the particle's mass
and the acceleration of the particle. This law is, of course, nothing but
the definition of force. Most definitions are trivial; others are profound.
Feynman [28] has analyzed the characteristics that make Newton's defi-
nition profound:
"It implies that if we study the mass times the acceleration and call
the product the force, i.e., if we study the characteristics of force as a
program of interest, then we shall find that forces have some simplicity;
the law is a good program for analyzing nature, it is a suggestion that
the forces will be simple."
Now suppose that x is a stochastic process representing the motion
of a particle of mass m. Leaving unanalyzed the dynamical mechanism
causing the random fluctuations, we can ask how to express the fact that
there is an external force F acting on the particle. We do this simply by

F = ma,

where a is the mean acceleration (Chapter 11).
For example, suppose that x is the position in the Ornstein-Uhlenbeck
theory of Brownian motion, and suppose that the external force is F =
−grad V, where exp(−V/mβD) is integrable. In equilibrium, the particle
has probability density a normalization constant times exp(−V/mβD)
and satisfies

dx(t) = v(t) dt,
dv(t) = −βv(t) dt + K(x(t)) dt + dB(t),

where K = F/m = −grad V/m, and B has variance parameter 2β²D. Then

Dx(t) = D∗x(t) = v(t),
Dv(t) = −βv(t) + K(x(t)),
D∗v(t) = βv(t) + K(x(t)),
a(t) = K(x(t)).

Therefore the law F = ma holds.
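The final step can be checked by writing out the cancellation: since Dx(t) = D∗x(t) = v(t), the mean acceleration (11.15) applied to the relations above gives

```latex
a(t) = \tfrac12 DD_{*}x(t) + \tfrac12 D_{*}Dx(t)
     = \tfrac12 Dv(t) + \tfrac12 D_{*}v(t)
     = \tfrac12\bigl[-\beta v(t) + K(x(t))\bigr]
       + \tfrac12\bigl[\beta v(t) + K(x(t))\bigr]
     = K(x(t)).
```

The frictional terms ∓βv(t) cancel, leaving ma(t) = mK(x(t)) = F.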


[28]. Richard P. Feynman, Robert B. Leighton, and Matthew Sands, “The
Feynman Lectures on Physics”, Addison-Wesley, Reading, Massachusetts,
Chapter 13

Kinematics of Markovian motion

At this point I shall cease making regularity assumptions explicit.
Whenever we take the derivative of a function, the function is assumed
to be differentiable. Whenever we take D of a stochastic process, it is
assumed to exist. Whenever we consider the probability density of a ran-
dom variable, it is assumed to exist. I do this not out of laziness but out
of ignorance. The problem of finding convenient regularity assumptions
for this discussion and later applications of it (Chapter 15) is a non-trivial
one.
Consider a Markov process x on ℝℓ of the form

dx(t) = b(x(t), t) dt + dw(t),

where w is a Wiener process on ℝℓ with diffusion coefficient ν (we write
ν instead of D to avoid confusion with mean forward derivatives). Here
b is a fixed smooth function on ℝℓ+1. The w(t) − w(s) are independent of
the x(r) whenever r ≤ s and r ≤ t, so that

Dx(t) = b(x(t), t).
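The diffusion coefficient ν can be read off from a sample path through the quadratic variation: the drift contributes only at order dt², so the sum of squared increments divided by 2T estimates ν. A sketch under the chapter's convention E[(dw)²] = 2ν dt (the drift b(x, t) = −x and all parameter values are illustrative choices):

```python
# Sketch: for dx = b(x,t) dt + dw with diffusion coefficient nu
# (E[(dw)^2] = 2*nu*dt), the drift is invisible at leading order in the
# quadratic variation, so sum((dx)^2) / (2T) estimates nu.
import numpy as np

rng = np.random.default_rng(0)
nu, dt, n_steps = 0.5, 0.001, 200_000

x = 0.0
increments = np.empty(n_steps)
noise = rng.normal(0.0, np.sqrt(2.0 * nu * dt), size=n_steps)
for k in range(n_steps):
    dx = -x * dt + noise[k]       # Euler-Maruyama step with b(x,t) = -x
    increments[k] = dx
    x += dx

T = n_steps * dt
nu_hat = np.square(increments).sum() / (2.0 * T)
print(nu_hat)   # close to nu
```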
A Markov process with time reversed is again a Markov process (see
Doob [15, §6, p. 83]), so we can define b∗ by
