

probability P is a measure that has total mass 1. So we define

E X = ∫ X(ω) P(dω).

To recall how the definition goes, we say X is simple if X(ω) = Σi ai 1Ai (ω) with each
ai ≥ 0, and for a simple X we define

E X = Σi ai P(Ai ).

If X is nonnegative, we define

E X = sup{E Y : Y simple, Y ≤ X}.

Finally, provided at least one of E X+ and E X− is finite, we define

E X = E X+ − E X−.

This is the same definition as described above.
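The definition by simple functions can be illustrated numerically. Below is a minimal sketch of my own (not from the text), assuming NumPy and a Monte Carlo sample of Ω = [0, 1] with uniform P: for the nonnegative random variable X(ω) = ω², the standard simple functions min(⌊2ⁿX⌋/2ⁿ, n) lie below X, and their expectations increase toward E X = 1/3.

```python
import numpy as np

rng = np.random.default_rng(0)
omega = rng.uniform(0.0, 1.0, size=10**6)   # sample points of Omega = [0, 1] under uniform P
X = omega**2                                 # a nonnegative random variable with E X = 1/3

def simple_approximation(x, n):
    """Standard simple function below x: x rounded down to the grid k/2^n, capped at n."""
    return np.minimum(np.floor(x * 2**n) / 2**n, n)

for n in (1, 2, 4, 8):
    Y = simple_approximation(X, n)   # Y is simple and Y <= X
    print(n, Y.mean())               # E Y increases toward E X = 1/3
```

The printed expectations are monotone in n, matching the supremum characterization above.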
Note 2. The Radon-Nikodym theorem from measure theory says that if Q and P are two
finite measures on (Ω, G) and Q(A) = 0 whenever P(A) = 0 and A ∈ G, then there exists an
integrable function Y that is G-measurable such that Q(A) = ∫_A Y dP for every measurable
set A.
Let us apply the Radon-Nikodym theorem to the following situation. Suppose (Ω, F, P)
is a probability space and X ≥ 0 is integrable: E X < ∞. Suppose G ⊂ F. Define two new
probabilities on G as follows. Let P′ = P|G , that is, P′(A) = P(A) if A ∈ G and P′(A) is not
defined if A ∈ F − G. Define Q by Q(A) = ∫_A X dP = E [X; A] if A ∈ G. One can show
(using the monotone convergence theorem from measure theory) that Q is a finite measure on
G. (One can also use this definition to define Q(A) for A ∈ F, but we only want to define Q
on G, as we will see in a moment.) So Q and P′ are two finite measures on (Ω, G). If A ∈ G
and P′(A) = 0, then P(A) = 0 and so it follows that Q(A) = 0. By the Radon-Nikodym
theorem there exists an integrable random variable Y such that Y is G-measurable (this is why
we worried about which σ-field we were working with) and

Q(A) = ∫_A Y dP′

if A ∈ G. Note
(a) Y is G-measurable, and
(b) if A ∈ G,

E [Y ; A] = E [X; A],

since

E [Y ; A] = E [Y 1A ] = ∫_A Y dP = ∫_A Y dP′ = Q(A) = ∫_A X dP = E [X1A ] = E [X; A].

We define E [X | G] to be the random variable Y . If X is integrable but not necessarily
nonnegative, then X+ and X− will be integrable and we define

E [X | G] = E [X+ | G] − E [X− | G].

We define
P(B | G) = E [1B | G]

if B ∈ F.
Let us show that there is only one r.v., up to almost sure equivalence, that satisfies (a)
and (b) above. If Y and Z are G-measurable, and E [Y ; A] = E [X; A] = E [Z; A] for A ∈ G,
then the set An = (Y > Z + 1/n) will be in G, and so

E [Z; An ] + (1/n) P(An ) = E [Z + 1/n; An ] ≤ E [Y ; An ] = E [Z; An ].

Consequently P(An ) = 0. This is true for each positive integer n, so P(Y > Z) = 0. By
symmetry, P(Z > Y ) = 0, and therefore P(Y ≠ Z) = 0 as we wished.
If one checks the proofs of Propositions 2.3, 2.4, and 2.5, one sees that only properties
(a) and (b) above were used. So the propositions hold for the new definition of conditional
expectation as well.
In the case where G is finitely or countably generated, under both the new and old
definitions (a) and (b) hold. By the uniqueness result, the new and old definitions agree.
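In the finitely generated case, the construction is concrete: E [X | G] is constant on each atom of the partition, equal to the average of X over that atom. The following sketch (my own, assuming NumPy; the partition into quarters of [0, 1] is an arbitrary choice) builds this conditional expectation and checks property (b) numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
omega = rng.uniform(0.0, 1.0, size=10**6)   # Omega = [0, 1] with uniform P
X = np.exp(omega)                            # an integrable, nonnegative random variable

# G generated by the finite partition A_j = [j/4, (j+1)/4), j = 0, ..., 3
atom = np.minimum(np.floor(omega * 4).astype(int), 3)

# On a finitely generated sigma-field, E[X | G] is constant on each atom,
# with value E[X; A_j] / P(A_j) -- the average of X over the atom.
Y = np.empty_like(X)
for j in range(4):
    mask = atom == j
    Y[mask] = X[mask].mean()

# Property (b): E[Y; A] = E[X; A] for each atom A, hence for every A in G
for j in range(4):
    mask = atom == j
    print(j, (Y * mask).mean(), (X * mask).mean())
```

Since every A ∈ G is a union of atoms, checking (b) on the atoms suffices.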

10. Stochastic processes.
We will be talking about stochastic processes. Previously we discussed sequences
S1 , S2 , . . . of r.v.'s. Now we want to talk about processes Yt for t ≥ 0. For example, we
can think of St being the price of a stock at time t. Any nonnegative time t is allowed.
We typically let Ft be the smallest σ-field with respect to which Ys is measurable
for all s ≤ t. So Ft = σ(Ys : s ≤ t). As you might imagine, there are a few technicalities
one has to worry about. We will try to avoid thinking about them as much as possible,
but see Note 1.
We call a collection of σ-fields Ft with Fs ⊂ Ft if s < t a filtration. We say the
filtration satisfies the "usual conditions" if the Ft are right continuous and complete (see
Note 1); all the filtrations we consider will satisfy the usual conditions.
We say a stochastic process has continuous paths if the following holds. For each
ω, the map t → Yt (ω) defines a function from [0, ∞) to R. If this function is a continuous
function for all ω's except for a set of probability zero, we say Yt has continuous paths.

Definition 10.1. A mapping τ : Ω → [0, ∞) is a stopping time if for each t we have

(τ ≤ t) ∈ Ft .

Typically, τ will be a continuous random variable and P(τ = t) = 0 for each t, which
is why we need a definition just a bit different from the discrete case.
Since (τ < t) = ∪_{n=1}^∞ (τ ≤ t − 1/n) and (τ ≤ t − 1/n) ∈ F_{t−1/n} ⊂ Ft , then for a stopping
time τ we have (τ < t) ∈ Ft for all t.
Conversely, suppose τ is a nonnegative r.v. for which (τ < t) ∈ Ft for all t. We
claim τ is a stopping time. The proof is easy, but we need the right continuity of the Ft
here, so we put the proof in Note 2.
A continuous time martingale (or submartingale) is what one expects: each Mt is
integrable, each Mt is Ft measurable, and if s < t, then

E [Mt | Fs ] = Ms

(with "≥" in place of "=" for a submartingale).

(Here we are saying the left hand side and the right hand side are equal almost surely; we
will usually not write the “a.s.” since almost all of our equalities for random variables are
only almost surely.)
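By (b) of the conditional expectation characterization, E [Mt | Fs ] = Ms amounts to E [Mt ; A] = E [Ms ; A] for every A ∈ Fs. A minimal Monte Carlo sketch (my own; it assumes NumPy and uses a discretized Brownian motion, a process introduced only in Section 11) checks this on one event A ∈ Fs :

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps, dt = 200_000, 100, 0.01

# A discretized Brownian motion: partial sums of independent mean-zero
# Gaussian increments form a martingale with respect to their own filtration.
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
M = np.cumsum(increments, axis=1)

s_idx, t_idx = 49, 99            # s = 0.5, t = 1.0
Ms, Mt = M[:, s_idx], M[:, t_idx]

# E[Mt | Fs] = Ms is equivalent to E[Mt; A] = E[Ms; A] for every A in Fs;
# check it on the event A = (Ms > 0), which is in Fs.
A = Ms > 0
print((Mt * A).mean(), (Ms * A).mean())   # agree up to Monte Carlo error
```

The two printed averages differ only by Monte Carlo error, since the increment Mt − Ms is independent of Fs and has mean zero.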
The analogues of Doob's theorems go through. Note 3 has the proofs.
Note 1. For technical reasons, one typically defines Ft as follows. Let Ft″ = σ(Ys : s ≤ t).
This is what we referred to as Ft above. Next add to Ft″ all sets N for which P(N ) = 0. Such
sets are called null sets, and since they have probability 0, they don't affect anything. In fact,
one wants to add all sets N that we think of as being null sets, even though they might not be
measurable. To be more precise, we say N is a null set if inf{P(A) : A ∈ F, N ⊂ A} = 0.
Recall we are starting with a σ-field F and all the Ft″ 's are contained in F. Let Ft′ be the σ-
field generated by Ft″ and all null sets N , that is, the smallest σ-field containing Ft″ and every
null set. In measure theory terminology, what we have done is to say Ft′ is the completion of
Ft″.
Lastly, we want to make our σ-fields right continuous. We set Ft = ∩_{ε>0} F′_{t+ε}. Al-
though the union of σ-fields is not necessarily a σ-field, the intersection of σ-fields is. Ft
contains Ft′ but might possibly contain more besides. An example of an event that is in Ft
but that may not be in Ft′ is

A = {ω : lim_{m→∞} Y_{t+1/m}(ω) ≥ 0}.

A ∈ F″_{t+1/m} for each m, so it is in Ft . There is no reason it needs to be in Ft″ if Y is not
necessarily continuous at t. It is easy to see that ∩_{ε>0} F_{t+ε} = Ft , which is what we mean
when we say Ft is right continuous.
When talking about a stochastic process Yt , there are various types of measurability
one can consider. Saying Yt is adapted to Ft means Yt is Ft measurable for each t. However,
since Yt is really a function of two variables, t and ω, there are other notions of measurability
that come into play. We will be considering stochastic processes that have continuous paths or
that are predictable (the definition will be given later), so these various types of measurability
will not be an issue for us.

Note 2. Suppose (τ < t) ∈ Ft for all t. Then for each positive integer n0 ,

(τ ≤ t) = ∩_{n=n0}^∞ (τ < t + 1/n).

For n ≥ n0 we have (τ < t + 1/n) ∈ F_{t+1/n} ⊂ F_{t+1/n0}. Therefore (τ ≤ t) ∈ F_{t+1/n0} for each
n0 . Hence the set is in the intersection: ∩_{n0>1} F_{t+1/n0} ⊂ ∩_{ε>0} F_{t+ε} = Ft .

Note 3. We want to prove the analogues of Theorems 5.3 and 5.4. The proofs of Doob's
inequalities are simpler. We will only need the analogue of Theorem 5.4(b).

Theorem 10.2. Suppose Mt is a martingale with continuous paths and E Mt² < ∞ for
all t. Then for each t0 ,

E [(sup_{s≤t0} Ms )²] ≤ 4E [|Mt0 |²].

Proof. By the definition of martingale in continuous time, Nk is a martingale in discrete time
with respect to Gk when we set Nk = M_{kt0/2^n} and Gk = F_{kt0/2^n}. By Theorem 5.4(b),

E [ max_{0≤k≤2^n} M²_{kt0/2^n} ] = E [ max_{0≤k≤2^n} N²_k ] ≤ 4E N²_{2^n} = 4E M²_{t0}.

(Recall (max_k ak )² = max_k a²_k if all the ak ≥ 0.)
Now let n → ∞. Since Mt has continuous paths, max_{0≤k≤2^n} M_{kt0/2^n} increases up to
sup_{s≤t0} Ms . Our result follows from the monotone convergence theorem from measure theory
(see Note 4).
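Doob's inequality can be checked by simulation. The sketch below (my own, assuming NumPy; it again uses a discretized Brownian motion, and checks the inequality in the form E [(sup_{s≤t0} |Ms |)²] ≤ 4E M²_{t0}, which dominates the stated version):

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps = 100_000, 200

# Discretized Brownian motion on [0, t0] with t0 = 1 as the martingale Mt
increments = rng.normal(0.0, np.sqrt(1.0 / n_steps), size=(n_paths, n_steps))
M = np.cumsum(increments, axis=1)

lhs = (np.abs(M).max(axis=1) ** 2).mean()   # ~ E[(sup_{s<=t0} |Ms|)^2]
rhs = 4 * (M[:, -1] ** 2).mean()            # ~ 4 E[M_{t0}^2]
print(lhs, rhs)                              # Doob's inequality: lhs <= rhs
```

Here E M²_{t0} = 1, so the right hand side is close to 4, with a visible gap above the left hand side.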

We now prove the analogue of Theorem 5.3. The proof is simpler if we assume that
E Mt² is finite; the result is still true without this assumption.

Theorem 10.3. Suppose Mt is a martingale with continuous paths, E Mt² < ∞ for all t,
and τ is a stopping time bounded almost surely by t0 . Then E Mτ = E Mt0 .

Proof. We approximate τ by stopping times taking only finitely many values. For n > 0 let

τn (ω) = inf{kt0 /2^n : τ (ω) < kt0 /2^n }.

τn takes only the values kt0 /2^n for some k ≤ 2^n . The event (τn ≤ jt0 /2^n ) is equal to
(τ < jt0 /2^n ), which is in F_{jt0/2^n} since τ is a stopping time. So (τn ≤ s) ∈ Fs if s is of the
form jt0 /2^n , for some j. A moment's thought, using the fact that τn only takes values of the
form kt0 /2^n , shows that τn is a stopping time.
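The dyadic approximation just defined is simply τ rounded up to the next strictly larger point of the grid kt0 /2^n. A small numerical sketch (my own, assuming NumPy; the uniform values stand in for a stopping time bounded by t0 ):

```python
import numpy as np

# tau_n = inf{k t0/2^n : tau < k t0/2^n}: tau rounded up to the next
# strictly larger dyadic point of order n.
def dyadic_approximation(tau, t0, n):
    return (np.floor(tau * 2**n / t0) + 1) * t0 / 2**n

rng = np.random.default_rng(4)
t0 = 1.0
tau = rng.uniform(0.0, t0, size=10)   # stand-in values for a stopping time < t0

for n in (2, 5, 10):
    tau_n = dyadic_approximation(tau, t0, n)
    print(n, float(np.max(tau_n - tau)))   # tau_n > tau, within t0/2^n of tau
```

This makes visible both facts used below: τn > τ always, and τn decreases to τ as n grows.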
It is clear that τn ↓ τ for every ω. Since Mt has continuous paths, Mτn → Mτ a.s.
Let Nk and Gk be as in the proof of Theorem 10.2. Let σn = k if τn = kt0 /2^n . By
Theorem 5.3,

E Nσn = E N_{2^n} ,

which is the same as saying

E Mτn = E Mt0 .

To complete the proof, we need to show E Mτn converges to E Mτ . This is almost
obvious, because we already observed that Mτn → Mτ a.s. Without the assumption that
E Mt² < ∞ for all t, this is actually quite a bit of work, but with the assumption it is not too
bad. Either |Mτn − Mτ | is less than or equal to 1 or greater than 1. If it is greater than 1,
it is less than |Mτn − Mτ |². So in either case,

|Mτn − Mτ | ≤ 1 + |Mτn − Mτ |². (10.1)

Because both |Mτn | and |Mτ | are bounded by sup_{s≤t0} |Ms |, the right hand side of (10.1) is
bounded by 1 + 4 sup_{s≤t0} |Ms |², which is integrable by Theorem 10.2. |Mτn − Mτ | → 0, and
so by the dominated convergence theorem from measure theory (Note 4),

E |Mτn − Mτ | → 0.

Finally,

|E Mτn − E Mτ | = |E (Mτn − Mτ )| ≤ E |Mτn − Mτ | → 0.
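The conclusion E Mτ = E Mt0 can be checked by simulation. Below is a sketch of my own (assuming NumPy), using a discretized Brownian motion and, as the bounded stopping time, the first time the path reaches level 0.5 capped at t0 ; the level 0.5 is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(6)
n_paths, n_steps, t0 = 100_000, 200, 1.0

# Discretized Brownian motion on [0, t0]
increments = rng.normal(0.0, np.sqrt(t0 / n_steps), size=(n_paths, n_steps))
M = np.cumsum(increments, axis=1)

# tau = first time |M| reaches 0.5, capped at t0, so tau is bounded by t0
hit = np.abs(M) >= 0.5
first = np.where(hit.any(axis=1), hit.argmax(axis=1), n_steps - 1)
M_tau = M[np.arange(n_paths), first]

print(M_tau.mean(), M[:, -1].mean())   # both approximately E M_{t0} = 0
```

Both averages are near 0, as Theorem 10.3 predicts, up to Monte Carlo error.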

Note 4. The dominated convergence theorem says that if Xn → X a.s. and |Xn | ≤ Y a.s.
for each n, where E Y < ∞, then E Xn → E X.
The monotone convergence theorem says that if Xn ≥ 0 for each n, Xn ≤ Xn+1 for
each n, and Xn → X, then E Xn → E X.

11. Brownian motion.
First, let us review a few facts about normal random variables. We say X is a
normal random variable with mean a and variance b² if

P(c ≤ X ≤ d) = ∫_c^d (1/√(2πb²)) e^{−(y−a)²/2b²} dy

and we will abbreviate this by saying X is N (a, b²). If X is N (a, b²), then E X = a,
Var X = b², and E |X|^p is finite for every positive integer p. Moreover

E e^{tX} = e^{at} e^{b²t²/2}.
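The moment generating function formula is easy to check by simulation. A minimal sketch of my own (assuming NumPy; the particular values of a, b, t are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
a, b, t = 1.0, 2.0, 0.3

X = rng.normal(a, b, size=10**6)             # X is N(a, b^2)
mgf_mc = np.exp(t * X).mean()                 # Monte Carlo estimate of E e^{tX}
mgf_exact = np.exp(a * t + b**2 * t**2 / 2)   # the formula e^{at} e^{b^2 t^2 / 2}
print(mgf_mc, mgf_exact)
```

The two values agree up to Monte Carlo error; the same comparison also verifies E X = a and Var X = b² on the sample.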

