Let us look at an example of a European call so that it is clear how to do the calculations. Consider the binomial asset pricing model with n = 3, u = 2, d = 1/2, r = 0.1, S0 = 10, and K = 15. If V is a European call with strike price K and exercise date n, let us compute explicitly the random variables V1 and V2 and calculate the value V0. Let us also compute the hedging strategy ∆0, ∆1, and ∆2.

Let

p = ((1 + r) - d)/(u - d) = .4,    q = (u - (1 + r))/(u - d) = .6.

The following table describes the values of the stock, the payoff V, and the probabilities for each possible outcome ω.


ω      S1     S2       S3        V    Probability
HHH    10u    10u^2    10u^3     65   p^3
HHT    10u    10u^2    10u^2 d   5    p^2 q
HTH    10u    10ud     10u^2 d   5    p^2 q
HTT    10u    10ud     10ud^2    0    p q^2
THH    10d    10ud     10u^2 d   5    p^2 q
THT    10d    10ud     10ud^2    0    p q^2
TTH    10d    10d^2    10ud^2    0    p q^2
TTT    10d    10d^2    10d^3     0    q^3

We then calculate

V0 = (1 + r)^{-3} E V = (1 + r)^{-3} (65p^3 + 15p^2 q) = 4.2074.

V1 = (1 + r)^{-2} E [V | F1], so we have

V1(H) = (1 + r)^{-2} (65p^2 + 10pq) = 10.5785,    V1(T) = (1 + r)^{-2} 5p^2 = .6612,

where for V1(T) we note that, given a tail on the first toss, the payoff is nonzero only on THH, which has conditional probability p^2.

V2 = (1 + r)^{-1} E [V | F2], so we have

V2(HH) = (1 + r)^{-1} (65p + 5q) = 26.3636,    V2(HT) = (1 + r)^{-1} 5p = 1.8182,

V2(TH) = (1 + r)^{-1} 5p = 1.8182,    V2(TT) = 0.

The formula for ∆k is given by

∆k = (Vk+1(H) - Vk+1(T)) / (Sk+1(H) - Sk+1(T)),

so

∆0 = (V1(H) - V1(T))/(S1(H) - S1(T)) = (10.5785 - .6612)/(20 - 5) = .6612,

where V1 and S1 are as above.

∆1(H) = (V2(HH) - V2(HT))/(S2(HH) - S2(HT)) = (26.3636 - 1.8182)/(40 - 10) = .8182,

∆1(T) = (V2(TH) - V2(TT))/(S2(TH) - S2(TT)) = (1.8182 - 0)/(10 - 2.5) = .2424.

∆2(HH) = (V3(HHH) - V3(HHT))/(S3(HHH) - S3(HHT)) = 1.0,

∆2(HT) = (V3(HTH) - V3(HTT))/(S3(HTH) - S3(HTT)) = .3333,

∆2(TH) = (V3(THH) - V3(THT))/(S3(THH) - S3(THT)) = .3333,

∆2(TT) = (V3(TTH) - V3(TTT))/(S3(TTH) - S3(TTT)) = 0.0.
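As a sanity check, the whole computation can be run mechanically. The following Python sketch (variable and function names are ours, purely illustrative) builds the payoffs at time n and applies the recursion Vk = (1 + r)^{-1}(pVk+1(H) + qVk+1(T)) together with the formula for ∆k:

```python
import itertools

# Parameters of the example: n = 3, u = 2, d = 1/2, r = 0.1, S0 = 10, K = 15.
u, d, r = 2.0, 0.5, 0.1
S0, K, n = 10.0, 15.0, 3
p = ((1 + r) - d) / (u - d)   # risk-neutral probability of heads, = .4
q = (u - (1 + r)) / (u - d)   # = .6

def S(w):
    # Stock price after the toss sequence w, e.g. S("HT") = 10*u*d.
    return S0 * u ** w.count("H") * d ** w.count("T")

def paths(k):
    return ["".join(t) for t in itertools.product("HT", repeat=k)]

# V[k][w]: option value at time k given the first k tosses w.
V = {n: {w: max(S(w) - K, 0.0) for w in paths(n)}}
for k in range(n - 1, -1, -1):
    V[k] = {w: (p * V[k + 1][w + "H"] + q * V[k + 1][w + "T"]) / (1 + r)
            for w in paths(k)}

# Delta[k][w]: hedge ratio (V_{k+1}(wH) - V_{k+1}(wT)) / (S_{k+1}(wH) - S_{k+1}(wT)).
Delta = {k: {w: (V[k + 1][w + "H"] - V[k + 1][w + "T"])
                / (S(w + "H") - S(w + "T"))
             for w in paths(k)}
         for k in range(n)}

print(round(V[0][""], 4))   # V0 = 4.2074
```

Printing the two dictionaries reproduces the values of V1, V2, and the ∆k.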


Note 1. The second equality in (7.1) is not entirely obvious. Intuitively, it says that one has heads with probability p, in which case the value of Vk+1 is Vk+1(H), and one has tails with probability q, in which case the value of Vk+1 is Vk+1(T).

Let us give a more rigorous proof of (7.1). The right-hand side of (7.1) is Fk measurable, so we need to show that if A ∈ Fk, then

E [Vk+1; A] = E [pVk+1(H) + qVk+1(T); A].

By linearity, it suffices to show this for A = {ω = (t1 t2 · · · tn) : t1 = s1, . . . , tk = sk}, where s1 s2 · · · sk is any sequence of H's and T's. Now

E [Vk+1; s1 · · · sk] = E [Vk+1; s1 · · · sk H] + E [Vk+1; s1 · · · sk T]
                     = Vk+1(s1 · · · sk H) P(s1 · · · sk H) + Vk+1(s1 · · · sk T) P(s1 · · · sk T).

By independence this is

Vk+1(s1 · · · sk H) P(s1 · · · sk) p + Vk+1(s1 · · · sk T) P(s1 · · · sk) q,

which is what we wanted.
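The identity E [Vk+1; A] = E [pVk+1(H) + qVk+1(T); A] can also be confirmed by brute-force enumeration over the eight paths of the three-period example. In the sketch below (our own code; Y stands for any random variable determined by the first k + 1 tosses, here (S2 - 15)^+ as an illustration):

```python
import itertools

p, q = 0.4, 0.6
u, d, S0, n = 2.0, 0.5, 10.0, 3

def prob(w):
    # P(w) = p^{#heads} * q^{#tails} under the risk-neutral measure.
    return p ** w.count("H") * q ** w.count("T")

def S(w):
    return S0 * u ** w.count("H") * d ** w.count("T")

k = 1
def Y(w):
    # Any F_{k+1}-measurable random variable; here (S_2 - 15)^+.
    return max(S(w[:k + 1]) - 15.0, 0.0)

all_paths = ["".join(t) for t in itertools.product("HT", repeat=n)]
for A in ("H", "T"):   # the atoms {t1 = H}, {t1 = T} of F_1
    lhs = sum(Y(w) * prob(w) for w in all_paths if w.startswith(A))
    rhs = sum((p * Y(w[:k] + "H") + q * Y(w[:k] + "T")) * prob(w)
              for w in all_paths if w.startswith(A))
    assert abs(lhs - rhs) < 1e-12
```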


8. American options.

An American option is one where you can exercise the option at any time before some fixed time T. For example, a European call can only be used to buy a share of stock at the expiration time T, while with an American call, at any time before time T, one can decide to pay K dollars and obtain a share of stock.

Let us give an informal argument on how to price an American call, giving a more rigorous argument in a moment. One can always wait until time T to exercise an American call, so the value must be at least as great as that of a European call. On the other hand, suppose you decide to exercise early. You pay K dollars, receive one share of stock, and your wealth is St - K. You hold onto the stock, and at time T you have one share of stock worth ST, and for which you paid K dollars. So your wealth is ST - K ≤ (ST - K)^+. In fact, we have strict inequality, because you lost the interest on your K dollars that you would have received if you had waited to exercise until time T. Therefore an American call is worth no more than a European call, and hence its value must be the same as that of a European call.

This argument does not work for puts, because selling stock gives you some money

on which you will receive interest, so it may be advantageous to exercise early. (A put is

the option to sell a stock at a price K at time T .)
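The argument for calls can be confirmed numerically on the three-period example of Section 7: at every node the immediate-exercise value Sk - K is at most the value of holding. A sketch (our own code, not part of the notes):

```python
import itertools

u, d, r = 2.0, 0.5, 0.1
S0, K, n = 10.0, 15.0, 3
p = ((1 + r) - d) / (u - d)
q = (u - (1 + r)) / (u - d)

def S(w):
    return S0 * u ** w.count("H") * d ** w.count("T")

def paths(k):
    return ["".join(t) for t in itertools.product("HT", repeat=k)]

V = {w: max(S(w) - K, 0.0) for w in paths(n)}   # call values at time n
for k in range(n - 1, -1, -1):
    V = {w: (p * V[w + "H"] + q * V[w + "T"]) / (1 + r) for w in paths(k)}
    for w in paths(k):
        # Exercising now yields S_k - K; holding is worth the European value.
        assert S(w) - K <= V[w] + 1e-12
```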

Here is the more rigorous argument. Suppose that if you exercise the option at time k, your payoff is g(Sk). In present-day dollars, that is, after correcting for inflation, you have (1 + r)^{-k} g(Sk). You have to make a decision on when to exercise the option, and that decision can only be based on what has already happened, not on what is going to happen in the future. In other words, we have to choose a stopping time τ, and we exercise the option at time τ(ω). Thus our payoff is (1 + r)^{-τ} g(Sτ). This is a random quantity. What we want to do is find the stopping time that maximizes the expected value of this random variable. As usual, we work with P, and thus we are looking for the stopping time τ such that τ ≤ n and

E [(1 + r)^{-τ} g(Sτ)]

is as large as possible. The problem of finding such a τ is called an optimal stopping problem.

Suppose g(x) is convex with g(0) = 0. Certainly g(x) = (x - K)^+ is such a function. We will show that τ ≡ n is the solution to the above optimal stopping problem: the best time to exercise is as late as possible.

We have

g(λx) = g(λx + (1 - λ) · 0) ≤ λg(x) + (1 - λ)g(0) = λg(x),    0 ≤ λ ≤ 1.    (8.1)


By Jensen's inequality,

E [(1 + r)^{-(k+1)} g(Sk+1) | Fk] = (1 + r)^{-k} E [(1/(1 + r)) g(Sk+1) | Fk]
    ≥ (1 + r)^{-k} E [g(Sk+1/(1 + r)) | Fk]
    ≥ (1 + r)^{-k} g(E [Sk+1/(1 + r) | Fk])
    = (1 + r)^{-k} g(Sk).

For the first inequality we used (8.1) with λ = 1/(1 + r); the second is Jensen's inequality, and the last equality holds because E [Sk+1 | Fk] = (1 + r)Sk under P. So (1 + r)^{-k} g(Sk) is a submartingale. By optional stopping,

E [(1 + r)^{-τ} g(Sτ)] ≤ E [(1 + r)^{-n} g(Sn)],

so τ ≡ n always does best.

For puts, the payoff is g(Sk), where g(x) = (K - x)^+. This is also a convex function, but this time g(0) = K ≠ 0, and the above argument fails.

Although good approximations are known, an exact solution to the problem of valuing an American put is unknown, and is one of the major unsolved problems in financial mathematics.
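On a finite tree, by contrast with the continuous-time problem just mentioned, an American put is straightforward to value numerically: work backwards, taking at each node the larger of the exercise value (K - Sk)^+ and the discounted continuation value. A sketch on the tree of Section 7 (parameters reused purely for illustration; the code and names are ours):

```python
import itertools

u, d, r = 2.0, 0.5, 0.1
S0, K, n = 10.0, 15.0, 3
p = ((1 + r) - d) / (u - d)
q = (u - (1 + r)) / (u - d)

def S(w):
    return S0 * u ** w.count("H") * d ** w.count("T")

def paths(k):
    return ["".join(t) for t in itertools.product("HT", repeat=k)]

def put_value(american):
    V = {w: max(K - S(w), 0.0) for w in paths(n)}
    for k in range(n - 1, -1, -1):
        V = {w: (p * V[w + "H"] + q * V[w + "T"]) / (1 + r) for w in paths(k)}
        if american:
            # American feature: may exercise now instead of holding.
            V = {w: max(V[w], K - S(w)) for w in paths(k)}
    return V[""]

print(put_value(False), put_value(True))
```

Here the American value strictly exceeds the European one, so early exercise is genuinely advantageous for the put.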


9. Continuous random variables.

We are now going to start working toward continuous times and stocks that can take any positive number as a value, so we need to prepare by extending some of our definitions.

Given any random variable X ≥ 0, we can approximate it by r.v.'s Xn that are discrete. We let

Xn = Σ_{i=0}^{n2^n} (i/2^n) 1_{(i/2^n ≤ X < (i+1)/2^n)}.

In words, if X(ω) lies between 0 and n, we let Xn(ω) be the closest value i/2^n that is less than or equal to X(ω). For ω where X(ω) ≥ n + 2^{-n} we set Xn(ω) = 0. Clearly the Xn are discrete, and approximate X. In fact, on the set where X ≤ n, we have that |X(ω) - Xn(ω)| ≤ 2^{-n}.
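A direct implementation may make the approximation concrete (the function name is ours):

```python
def X_n(x, n):
    # Dyadic approximation: the largest i/2^n <= x when 0 <= x < n + 2^{-n},
    # and 0 otherwise, matching the indicator sum in the text.
    if x < 0 or x >= n + 2.0 ** -n:
        return 0.0
    i = int(x * 2 ** n)          # floor, since x >= 0
    return i / 2.0 ** n

# On [0, n] the error is at most 2^{-n}:
for x in (0.0, 0.3, 1.7, 2.9999):
    assert 0 <= x - X_n(x, 3) <= 2.0 ** -3
```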

For reasonable X we are going to define E X = lim E Xn. Since the Xn increase with n, the limit must exist, although it could be +∞. If X is not necessarily nonnegative, we define E X = E X^+ - E X^-, provided at least one of E X^+ and E X^- is finite. Here X^+ = max(X, 0) and X^- = max(-X, 0).

There are some things one wants to prove, but all this has been worked out in measure theory and the theory of the Lebesgue integral; see Note 1. Let us confine ourselves here to showing this definition is the same as the usual one when X has a density.

Recall X has a density fX if

P(X ∈ [a, b]) = ∫_a^b fX(x) dx

for all a and b. In this case

E X = ∫_{-∞}^{∞} x fX(x) dx,

provided ∫_{-∞}^{∞} |x| fX(x) dx < ∞. With our definition of Xn we have

P(Xn = i/2^n) = P(X ∈ [i/2^n, (i + 1)/2^n)) = ∫_{i/2^n}^{(i+1)/2^n} fX(x) dx.

Then

E Xn = Σ_i (i/2^n) P(Xn = i/2^n) = Σ_i (i/2^n) ∫_{i/2^n}^{(i+1)/2^n} fX(x) dx.

Since x differs from i/2^n by at most 1/2^n when x ∈ [i/2^n, (i + 1)/2^n), this will tend to ∫ x fX(x) dx, unless the contribution to the integral for |x| ≥ n does not go to 0 as n → ∞. As long as ∫ |x| fX(x) dx < ∞, one can show that this contribution does indeed go to 0.
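For a concrete density one can watch E Xn increase toward E X. Taking fX(x) = e^{-x} for x > 0 (so E X = 1), a numerical sketch (our own code):

```python
import math

def E_Xn(n):
    # E X_n = sum over i of (i/2^n) * P(i/2^n <= X < (i+1)/2^n) for X ~ Exp(1),
    # using P(X in [a, b)) = e^{-a} - e^{-b}.
    total = 0.0
    for i in range(n * 2 ** n + 1):
        a, b = i / 2.0 ** n, (i + 1) / 2.0 ** n
        total += a * (math.exp(-a) - math.exp(-b))
    return total

print([round(E_Xn(n), 4) for n in (2, 4, 6, 8)])
```

The values increase with n and approach E X = 1, the truncation beyond n and the dyadic rounding both contributing errors that vanish in the limit.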


We also need an extension of the definition of conditional probability. A r.v. X is G measurable if (X > a) ∈ G for every a. How do we define E [Z | G] when G is not generated by a countable collection of disjoint sets?

Again, there is a completely worked out theory that holds in all cases; see Note 2. Let us give an equivalent definition that works in all but a very few cases. Suppose that for each n the σ-field Gn is finitely generated. This means that Gn is generated by finitely many disjoint sets Bn1, . . . , Bnmn. So for each n, the number of Bni is finite but arbitrary, the Bni are disjoint, and their union is Ω. Suppose also that G1 ⊂ G2 ⊂ · · ·. Now ∪n Gn will not in general be a σ-field, but suppose G is the smallest σ-field that contains all the Gn. Finally, define P(A | G) = lim P(A | Gn).

This is a fairly general set-up. For example, let Ω be the real line and let Gn be generated by the sets (-∞, -n), [n, ∞), and the intervals [i/2^n, (i + 1)/2^n) for -n2^n ≤ i < n2^n. Then G will contain every interval that is closed on the left and open on the right, hence G must be the σ-field that one works with when one talks about Lebesgue measure on the line.

The question that one might ask is: how does one know the limit exists? Since

the Gn increase, we know by Proposition 4.3 that Mn = P(A | Gn ) is a martingale with

respect to the Gn . It is certainly bounded above by 1 and bounded below by 0, so by the

martingale convergence theorem, it must have a limit as n → ∞.
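For the dyadic example above one can watch the martingale Mn = P(A | Gn) converge. Take Ω = [0, 1) with Lebesgue measure, A = [0, 1/3), and Gn generated by the intervals [i/2^n, (i + 1)/2^n); the conditional probability at ω is just the fraction of ω's dyadic interval that lies in A. A sketch (names of our choosing):

```python
from fractions import Fraction

A_left, A_right = Fraction(0), Fraction(1, 3)     # A = [0, 1/3)

def cond_prob(omega, n):
    # P(A | G_n) at omega: length of the overlap of omega's dyadic interval
    # with A, divided by the interval's length 1/2^n.
    i = int(omega * 2 ** n)                       # omega is a Fraction in [0, 1)
    lo, hi = Fraction(i, 2 ** n), Fraction(i + 1, 2 ** n)
    overlap = max(Fraction(0), min(hi, A_right) - max(lo, A_left))
    return overlap / (hi - lo)

omega = Fraction(1, 5)                            # a point inside A
print([float(cond_prob(omega, n)) for n in (1, 2, 4, 8)])   # 2/3, then 1, 1, 1
```

For every ω ≠ 1/3 the dyadic interval containing ω eventually lies entirely inside or entirely outside A, so Mn is eventually constant at 1_A(ω), consistent with a.s. convergence.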

Once one has a definition of conditional probability, one defines conditional expectation by what one expects. If X is discrete, one can write X as Σ_j aj 1_{Aj} and then one defines

E [X | G] = Σ_j aj P(Aj | G).

If X is not discrete, one writes X = X^+ - X^-, approximates X^+ by discrete random variables, and takes a limit, and similarly for X^-. One has to worry about convergence, but everything does go through.

With this extended definition of conditional expectation, do all the properties of Section 2 hold? The answer is yes. See Note 2 again.

With continuous random variables, we need to be more cautious about what we mean when we say two random variables are equal. We say X = Y almost surely, abbreviated "a.s.", if

P({ω : X(ω) ≠ Y(ω)}) = 0.

So X = Y except for a set of probability 0. The a.s. terminology is used in other places as well: Xn → Y a.s. means that except for a set of ω's of probability zero, Xn(ω) → Y(ω).

Note 1. The best way to define expected value is via the theory of the Lebesgue integral. A