E [E [X | H] | G] = E [X | H] = E [E [X | G] | H].

Proof. E [X | H] is H measurable, hence G measurable, since H ⊂ G. The left hand
equality now follows by Proposition 3.5(3). To get the right hand equality, let W be the
right hand expression. It is H measurable, and if C ∈ H ⊂ G, then

E [W ; C] = E [E [X | G]; C] = E [X; C]

as required.

In words, if we are predicting a prediction of X given limited information, this is
the same as a single prediction given the smaller amount of information.
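On a finite probability space with the uniform measure, E [X | G] is just the average of X over each block of the partition generating G, so the tower property can be checked exactly. Here is a minimal Python sketch; the 8-point space, the two partitions, and the values of X are arbitrary choices for illustration, not from the text:

```python
from fractions import Fraction

def cond_exp(X, partition):
    """E[X | partition] on a uniform finite space:
    constant on each block, equal to the block average."""
    Y = [None] * len(X)
    for block in partition:
        avg = sum(Fraction(X[w]) for w in block) / len(block)
        for w in block:
            Y[w] = avg
    return Y

# H-blocks are unions of G-blocks, so H is the coarser sigma-field: H within G.
G = [[0, 1], [2, 3], [4, 5], [6, 7]]
H = [[0, 1, 2, 3], [4, 5, 6, 7]]
X = [3, 1, 4, 1, 5, 9, 2, 6]          # an arbitrary r.v. on 8 points

lhs = cond_exp(cond_exp(X, G), H)     # E[E[X | G] | H]
rhs = cond_exp(X, H)                  # E[X | H]
assert lhs == rhs                     # the tower property holds exactly
```

Using `Fraction` keeps all the averages exact, so the equality is checked with no rounding error.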
Let us verify that conditional expectation may be viewed as the best predictor of
a random variable given a σ-field. If X is a r.v., a predictor Z is just another random
variable, and the goodness of the prediction will be measured by E [(X − Z)²], which is
known as the mean square error.
Proposition 3.8. If X is a r.v., the best predictor among the collection of G-measurable
random variables is Y = E [X | G].

Proof. Let Z be any G-measurable random variable. We compute, using Proposition
3.5(3) and Proposition 3.6,

E [(X − Z)² | G] = E [X² | G] − 2E [XZ | G] + E [Z² | G]
= E [X² | G] − 2ZE [X | G] + Z²
= E [X² | G] − 2ZY + Z²
= E [X² | G] − Y² + (Y − Z)²
= E [X² | G] − 2Y E [X | G] + Y² + (Y − Z)²
= E [X² | G] − 2E [XY | G] + E [Y² | G] + (Y − Z)²
= E [(X − Y)² | G] + (Y − Z)².

We also used the fact that Y is G measurable. Taking expectations and using Proposition
3.5(4),
E [(X − Z)²] = E [(X − Y)²] + E [(Y − Z)²].

The right hand side is bigger than or equal to E [(X − Y)²] because (Y − Z)² ≥ 0. So the
error in predicting X by Z is at least as large as the error in predicting X by Y, with equality
if and only if Z = Y. So Y is the best predictor.
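The Pythagorean identity at the heart of this proof, E [(X − Z)²] = E [(X − Y)²] + E [(Y − Z)²], can be verified exactly on a finite uniform space, where "G-measurable" means constant on the blocks of a partition. A sketch with arbitrarily chosen values (the partition, X, and Z below are illustrative, not from the text):

```python
from fractions import Fraction

G = [[0, 1], [2, 3], [4, 5], [6, 7]]  # partition generating G
X = [3, 1, 4, 1, 5, 9, 2, 6]          # an arbitrary r.v. on 8 points
n = len(X)

def cond_exp(X, partition):
    """Block averages: E[X | G] on a uniform finite space."""
    Y = [None] * n
    for block in partition:
        avg = sum(Fraction(X[w]) for w in block) / len(block)
        for w in block:
            Y[w] = avg
    return Y

def mse(A, B):
    """Mean square error E[(A - B)^2] under the uniform measure."""
    return sum((Fraction(a) - Fraction(b)) ** 2 for a, b in zip(A, B)) / n

Y = cond_exp(X, G)
Z = [7, 7, 0, 0, 2, 2, 5, 5]          # some other G-measurable predictor

# The Pythagorean identity from the proof, and the resulting inequality.
assert mse(X, Z) == mse(X, Y) + mse(Y, Z)
assert mse(X, Y) <= mse(X, Z)
```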

There is one more interpretation of conditional expectation that may be useful. The
collection of all random variables is a linear space, and the collection of all G-measurable
random variables is clearly a subspace. Given X, the conditional expectation Y = E [X | G]
is equal to the projection of X onto the subspace of G-measurable random variables. To
see this, we write X = Y + (X − Y), and what we have to check is that the inner product
of Y and X − Y is 0, that is, Y and X − Y are orthogonal. In this context, the inner
product of X1 and X2 is defined to be E [X1 X2], so we must show E [Y (X − Y)] = 0. Note

E [Y (X − Y) | G] = Y E [X − Y | G] = Y (E [X | G] − Y) = Y (Y − Y) = 0.

Taking expectations,

E [Y (X − Y)] = E [E [Y (X − Y) | G]] = 0,

just as we wished.
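On a finite uniform space this orthogonality can be checked directly, with the inner product E [X1 X2] computed as an average. The partition and the values of X below are arbitrary choices for illustration:

```python
from fractions import Fraction

G = [[0, 1], [2, 3], [4, 5], [6, 7]]  # partition generating G
X = [3, 1, 4, 1, 5, 9, 2, 6]          # an arbitrary r.v. on 8 points
n = len(X)

# Y = E[X | G]: the block average, repeated on each block.
Y = [None] * n
for block in G:
    avg = sum(Fraction(X[w]) for w in block) / len(block)
    for w in block:
        Y[w] = avg

# Inner product E[X1 X2] under the uniform measure.
inner = sum(Y[w] * (X[w] - Y[w]) for w in range(n)) / n
assert inner == 0                     # Y and X - Y are orthogonal
```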

If Y is a discrete random variable, that is, it takes only countably many values
y1, y2, . . ., we let Bi = (Y = yi). These will be disjoint sets whose union is Ω. If σ(Y)
is the collection of all unions of the Bi, then σ(Y) is a σ-field, and is called the σ-field
generated by Y. It is easy to see that this is the smallest σ-field with respect to which Y
is measurable. We write E [X | Y ] for E [X | σ(Y)].
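For a concrete (and hypothetical) discrete Y, computing E [X | Y ] amounts to averaging X over each level set Bi = (Y = yi):

```python
from collections import defaultdict
from fractions import Fraction

# An arbitrary discrete Y and r.v. X on a uniform 8-point space.
Y = [0, 0, 1, 1, 1, 2, 2, 2]
X = [4, 2, 6, 0, 3, 1, 5, 9]

# Level sets B_i = (Y = y_i) partition the space.
blocks = defaultdict(list)
for w, y in enumerate(Y):
    blocks[y].append(w)

# E[X | Y] is the average of X over each B_i, assigned on all of B_i.
E_X_given_Y = [None] * len(X)
for ws in blocks.values():
    avg = sum(Fraction(X[w]) for w in ws) / len(ws)
    for w in ws:
        E_X_given_Y[w] = avg

assert E_X_given_Y[0] == 3            # average of {4, 2}
assert E_X_given_Y[2] == 3            # average of {6, 0, 3}
assert E_X_given_Y[5] == 5            # average of {1, 5, 9}
```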

Note 1. We prove Proposition 3.5. (1) and (2) are immediate from the definition. To prove
(3), note that if Z = X, then Z is G measurable and E [X; C] = E [Z; C] for any C ∈ G; this
is trivial. By Proposition 3.4 it follows that Z = E [X | G]; this proves (3). To prove (4), if we
let C = Ω and Y = E [X | G], then E Y = E [Y ; C] = E [X; C] = E X.
Last is (5). Let Z = E X. Z is constant, so clearly G measurable. By the
independence, if C ∈ G, then E [X; C] = E [X 1C ] = (E X)(E 1C ) = (E X)(P(C)). But
E [Z; C] = (E X)(P(C)) since Z is constant. By Proposition 3.4 we see Z = E [X | G].

Note 2. We prove Proposition 3.6. Note that ZE [X | G] is G measurable, so by Proposition
3.4 we need to show its expectation over sets C in G is the same as that of XZ. As in the
proof of Proposition 3.3, it suffices to consider only the case when C is one of the Bi. Now Z
is G measurable, hence it is constant on Bi; let its value be zi. Then

E [ZE [X | G]; Bi ] = E [zi E [X | G]; Bi ] = zi E [E [X | G]; Bi ] = zi E [X; Bi ] = E [XZ; Bi ]

as desired.

4. Martingales.
Suppose we have a sequence of σ-fields F1 ⊂ F2 ⊂ F3 ⊂ · · ·. An example would be
repeatedly tossing a coin and letting Fk be the sets that can be determined by the first
k tosses. Another example is to let Fk be the events that are determined by the values
of a stock at times 1 through k. A third example is to let X1, X2, . . . be a sequence of
random variables and let Fk be the σ-field generated by X1, . . . , Xk, the smallest σ-field
with respect to which X1, . . . , Xk are measurable.

Definition 4.1. A r.v. X is integrable if E |X| < ∞. Given an increasing sequence of
σ-fields Fn, a sequence of r.v.'s Xn is adapted if Xn is Fn measurable for each n.

Definition 4.2. A martingale Mn is a sequence of random variables such that

(1) Mn is integrable for all n,
(2) Mn is adapted to Fn, and
(3) for all n
E [Mn+1 | Fn ] = Mn . (4.1)

Usually (1) and (2) are easy to check, and it is (3) that is the crucial property. If
we have (1) and (2), but instead of (3) we have
(3′) for all n
E [Mn+1 | Fn ] ≥ Mn ,

then we say Mn is a submartingale. If we have (1) and (2), but instead of (3) we have
(3′′) for all n
E [Mn+1 | Fn ] ≤ Mn ,

then we say Mn is a supermartingale.
Submartingales tend to increase and supermartingales tend to decrease. The
nomenclature may seem like it goes the wrong way; Doob defined these terms by analogy
with the notions of subharmonic and superharmonic functions in analysis. (Actually,
it is more than an analogy: we won't explore this, but it turns out that the composition
of a subharmonic function with Brownian motion yields a submartingale, and similarly for
superharmonic functions.)
Note that the definition of martingale depends on the collection of σ-fields. When
it is needed for clarity, one can say that (Mn, Fn) is a martingale. To define conditional
expectation, one needs a probability, so a martingale depends on the probability as well.
When we need to, we will say that Mn is a martingale with respect to the probability P.
This is an issue when there is more than one probability around.
We will see that martingales are ubiquitous in financial math. For example, security
prices and one's wealth will turn out to be examples of martingales.

The word "martingale" is also used for the piece of a horse's bridle that runs from
the horse's head to its chest. It keeps the horse from raising its head too high. It turns out
that martingales in probability cannot get too large. The word also refers to a gambling
system. I did some searching on the Internet, and there seems to be no consensus on the
derivation of the term.
Here is an example of a martingale. Let X1, X2, . . . be a sequence of independent
r.v.'s with mean 0. (Saying a r.v. Xi has mean 0 is the same as
saying E Xi = 0; this presupposes that E |Xi| is finite.) Set Fn = σ(X1, . . . , Xn), the
σ-field generated by X1, . . . , Xn. Let Mn = X1 + · · · + Xn. Definition 4.2(2) is easy to see.
Since E |Mn| ≤ E |X1| + · · · + E |Xn|, Definition 4.2(1) also holds. We now check

E [Mn+1 | Fn ] = X1 + · · · + Xn + E [Xn+1 | Fn ] = Mn + E Xn+1 = Mn ,

where we used the independence.
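This computation can be confirmed by brute force for fair ±1 coin flips (a special case of the example, chosen only for illustration): for each fixed outcome of the first two flips, averaging M3 over the two equally likely values of X3 returns M2.

```python
from itertools import product
from fractions import Fraction

# Fair flips: X_i is -1 or +1 with probability 1/2 each; M_n = X_1 + ... + X_n.
# For every outcome of the first two flips, E[M_3 | F_2] = M_2.
for x1, x2 in product((-1, 1), repeat=2):
    M2 = x1 + x2
    avg_M3 = sum(Fraction(M2 + x3) for x3 in (-1, 1)) / 2
    assert avg_M3 == M2
```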
Another example: suppose in the above that the Xk all have variance 1, and let
Mn = Sn² − n, where Sn = X1 + · · · + Xn. Again (1) and (2) of Definition 4.2 are easy to check.
We compute

E [Mn+1 | Fn ] = E [Sn² + 2Xn+1 Sn + (Xn+1)² | Fn ] − (n + 1).

We have E [Sn² | Fn ] = Sn² since Sn is Fn measurable. Also

E [2Xn+1 Sn | Fn ] = 2Sn E [Xn+1 | Fn ] = 2Sn E Xn+1 = 0,

and E [(Xn+1)² | Fn ] = E (Xn+1)² = 1. Substituting, we obtain E [Mn+1 | Fn ] = Mn, so Mn is
a martingale.
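Again for fair ±1 flips (so each Xk has mean 0 and variance 1; a special case chosen for illustration), one can confirm the computation by averaging M3 = S3² − 3 over the two values of X3:

```python
from itertools import product
from fractions import Fraction

# X_i = -1 or +1 fair (mean 0, variance 1); S_n = X_1 + ... + X_n, M_n = S_n**2 - n.
# For each outcome of the first two flips, E[M_3 | F_2] = M_2.
for x1, x2 in product((-1, 1), repeat=2):
    S2 = x1 + x2
    M2 = S2 ** 2 - 2
    avg_M3 = sum(Fraction((S2 + x3) ** 2 - 3) for x3 in (-1, 1)) / 2
    assert avg_M3 == M2
```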
A third example: Suppose you start with a dollar and you are tossing a fair coin
independently. If it turns up heads you double your fortune, tails you go broke. This is
"double or nothing." Let Mn be your fortune at time n. To formalize this, let X1, X2, . . .
be independent r.v.'s that are equal to 2 with probability 1/2 and 0 with probability 1/2. Then
Mn = X1 · · · Xn. Let Fn be the σ-field generated by X1, . . . , Xn. Note 0 ≤ Mn ≤ 2ⁿ, and
so Definition 4.2(1) is satisfied, while (2) is easy. To compute the conditional expectation,
note E Xn+1 = 1. Then

E [Mn+1 | Fn ] = Mn E [Xn+1 | Fn ] = Mn E Xn+1 = Mn ,

using the independence.
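Enumerating the eight outcomes of three rounds confirms both the martingale property and E Mn = 1 (the choice n = 3 is arbitrary):

```python
from itertools import product
from fractions import Fraction

# Double or nothing: X_i is 2 or 0 with probability 1/2 each, M_n = X_1 ... X_n.
# Martingale property: averaging M_3 over the values of X_3 recovers M_2.
for x1, x2 in product((0, 2), repeat=2):
    M2 = x1 * x2
    avg_M3 = sum(Fraction(M2 * x3) for x3 in (0, 2)) / 2
    assert avg_M3 == M2

# And E M_3 = 1: only the all-heads outcome (2, 2, 2) contributes.
outcomes = list(product((0, 2), repeat=3))
EM3 = sum(Fraction(a * b * c) for a, b, c in outcomes) / len(outcomes)
assert EM3 == 1
```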
Before we give our fourth example, let us observe that

|E [X | F]| ≤ E [|X| | F]. (4.2)

To see this, we have −|X| ≤ X ≤ |X|, so −E [|X| | F] ≤ E [X | F] ≤ E [|X| | F]. Since
E [|X| | F] is nonnegative, (4.2) follows.
Our fourth example will be used many times, so we state it as a proposition.

Proposition 4.3. Let σ-fields F1 ⊂ F2 ⊂ · · · be given and let X be a fixed r.v. with E |X| < ∞. Let
Mn = E [X | Fn ]. Then Mn is a martingale.

Proof. Definition 4.2(2) is clear, while

E |Mn | ≤ E [E [|X| | Fn ]] = E |X| < ∞

by (4.2); this shows Definition 4.2(1). We have

E [Mn+1 | Fn ] = E [E [X | Fn+1 ] | Fn ] = E [X | Fn ] = Mn .
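Here is a sketch of Proposition 4.3 in the concrete case of three fair coin flips, with Fn generated by the first n flips and an arbitrarily chosen X (the r.v. below is illustrative): averaging Mn+1 over the (n + 1)-st flip recovers Mn, which is exactly the tower property used in the proof.

```python
from itertools import product
from fractions import Fraction

def X(flips):
    """An arbitrary r.v. determined by three fair coin flips in {0, 1}."""
    a, b, c = flips
    return a + 2 * b + 4 * c

def M(n, prefix):
    """M_n = E[X | F_n]: average X over all continuations of the first n flips."""
    tails = list(product((0, 1), repeat=3 - n))
    return sum(Fraction(X(prefix + t)) for t in tails) / len(tails)

# E[M_3 | F_2] = M_2: average M_3 over the two values of the third flip.
for prefix in product((0, 1), repeat=2):
    avg = sum(M(3, prefix + (c,)) for c in (0, 1)) / 2
    assert avg == M(2, prefix)
```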

5. Properties of martingales.
When it comes to discussing American options, we will need the concept of stopping
times. A mapping τ from Ω into the nonnegative integers is a stopping time if (τ = k) ∈ Fk
for each k.
An example is τ = min{k : Sk ≥ A}. This is a stopping time because (τ = k) =
(S1 < A, . . . , Sk−1 < A, Sk ≥ A) ∈ Fk. We can think of a stopping time as the first time
something happens. σ = max{k : Sk ≥ A}, the last time, is not a stopping time. (We will
use the convention that the minimum of an empty set is +∞; so, for example, with the
above definition of τ, on the event that Sk never reaches A, we have τ = ∞.)
Here is an intuitive description of a stopping time. If I tell you to drive to the city
limits and then drive until you come to the second stop light after that, you know when
you get there that you have arrived; you don't need to have been there before or to look
ahead. But if I tell you to drive until you come to the second stop light before the city
limits, either you must have been there before or else you have to go past where you are
supposed to stop, continue on to the city limits, and then turn around and come back two
stop lights. When you first reach the second stop light before the city limits, you do not
yet know that it is the one where you are supposed to stop. The first set of instructions forms
a stopping time; the second set does not.
Note (τ ≤ k) = (τ = 0) ∪ (τ = 1) ∪ · · · ∪ (τ = k). Since (τ = j) ∈ Fj ⊂ Fk, the event (τ ≤ k) ∈ Fk
for all k. Conversely, if τ is a r.v. with (τ ≤ k) ∈ Fk for all k, then

(τ = k) = (τ ≤ k) − (τ ≤ k − 1).

Since (τ ≤ k) ∈ Fk and (τ ≤ k − 1) ∈ Fk−1 ⊂ Fk, then (τ = k) ∈ Fk, and such a τ must
be a stopping time.
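A short sketch of the first example above: given a fixed ±1 path (the paths below are arbitrary), the first time the partial sums reach the level A is computable from the steps seen so far, which is exactly what makes it a stopping time.

```python
def hitting_time(path, A):
    """First k (1-indexed) with S_k >= A along the given +/-1 path, or
    None (standing in for tau = infinity) if the level is never reached.
    Deciding (tau = k) needs only the first k steps."""
    S = 0
    for k, x in enumerate(path, start=1):
        S += x
        if S >= A:
            return k
    return None

assert hitting_time([1, 1, -1, 1, 1], A=3) == 5   # S: 1, 2, 1, 2, 3
assert hitting_time([-1, -1, -1], A=1) is None    # never reaches the level
```

Computing the last time max{k : Sk ≥ A} would require scanning the whole path first, mirroring the "drive past and turn around" description above.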
Our first result is Jensen's inequality.
Proposition 5.1. If g is convex, then

g(E [X | G]) ≤ E [g(X) | G]

provided all the expectations exist.
For ordinary expectations rather than conditional expectations, this is still true.
That is, if g is convex and the expectations exist, then

g(E X) ≤ E [g(X)].

We already know some special cases of this: when g(x) = |x|, this says |E X| ≤ E |X|;
when g(x) = x², this says (E X)² ≤ E X², which we know because E X² − (E X)² =
E (X − E X)² ≥ 0.
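The x² case can be checked on a small example: the Jensen gap E X² − (E X)² is exactly the variance E (X − E X)², hence nonnegative. (The values of X below are arbitrary.)

```python
from fractions import Fraction

X = [3, 1, 4, 1, 5]                   # an arbitrary r.v. on a uniform space
n = len(X)
EX = sum(Fraction(x) for x in X) / n
EX2 = sum(Fraction(x) ** 2 for x in X) / n
var = sum((Fraction(x) - EX) ** 2 for x in X) / n

assert EX2 - EX ** 2 == var           # the Jensen gap is the variance
assert EX ** 2 <= EX2                 # (E X)^2 <= E X^2
```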

For Proposition 5.1 as well as many of the following propositions, the statement of
the result is more important than the proof, and we relegate the proof to Note 1 below.
One reason we want Jensen's inequality is to show that a convex function applied
to a martingale yields a submartingale.
Proposition 5.2. If Mn is a martingale and g is convex, then g(Mn) is a submartingale,
provided all the expectations exist.

Proof. By Jensen's inequality,

E [g(Mn+1 ) | Fn ] ≥ g(E [Mn+1 | Fn ]) = g(Mn ).
If Mn is a martingale, then E Mn = E [E [Mn+1 | Fn ]] = E Mn+1. So E M0 =
E M1 = · · · = E Mn. Doob's optional stopping theorem says the same thing holds when
fixed times n are replaced by stopping times.
Theorem 5.3. Suppose K is a positive integer, N is a stopping time such that N ≤ K
a.s., and Mn is a martingale. Then

E MN = E MK .

Here, to evaluate MN, one first finds N(ω) and then evaluates M·(ω) for that value of N.
Proof. We have

E MN = E [MN ; N = 0] + E [MN ; N = 1] + · · · + E [MN ; N = K].

If we show that the k-th summand is E [MK ; N = k], then the sum will be

E [MK ; N = 0] + E [MK ; N = 1] + · · · + E [MK ; N = K] = E MK

as desired. We have

E [MN ; N = k] = E [Mk ; N = k]

by the definition of MN. Now (N = k) is in Fk, so by Proposition 2.2 and the fact that
Mk = E [Mk+1 | Fk ],

E [Mk ; N = k] = E [Mk+1 ; N = k].

We have (N = k) ∈ Fk ⊂ Fk+1. Since Mk+1 = E [Mk+2 | Fk+1 ], Proposition 2.2 tells us
that

E [Mk+1 ; N = k] = E [Mk+2 ; N = k].
