
Spring 2003

Richard F. Bass

Department of Mathematics

University of Connecticut

These notes are © 2003 by Richard Bass. They may be used for personal use or class use, but not for commercial purposes. If you find any errors, I would appreciate hearing from you: bass@math.uconn.edu


1. Introduction.

In this course we will study mathematical finance. Mathematical finance is not about predicting the price of a stock. What it is about is figuring out the price of options and derivatives.

The most familiar type of option is the option to buy a stock at a given price at a given time. For example, suppose Microsoft is currently selling at $40 per share. A European call option is something I can buy that gives me the right to buy a share of Microsoft at some future date. To make up an example, suppose I have an option that allows me to buy a share of Microsoft for $50 in three months' time, but does not compel me to do so. If Microsoft happens to be selling at $45 in three months' time, the option is worthless. I would be silly to buy a share for $50 when I could call my broker and buy it for $45. So I would choose not to exercise the option. On the other hand, if Microsoft is selling for $60 three months from now, the option would be quite valuable. I could exercise the option and buy a share for $50. I could then turn around and sell the share on the open market for $60 and make a profit of $10 per share. Therefore this stock option I possess has some value. There is some chance it is worthless and some chance that it will lead me to a profit. The basic question is: how much is the option worth today?
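The payoff reasoning in the example above can be sketched in a few lines of code. (This is a minimal illustration; the function name and the dollar figures repeat the example, nothing here is from the notes themselves.)

```python
# Payoff of a European call at expiry: exercise only if the stock price
# exceeds the strike, otherwise the option expires worthless.

def call_payoff(stock_price, strike):
    return max(stock_price - strike, 0.0)

# The two scenarios from the text, with a $50 strike:
print(call_payoff(45.0, 50.0))  # worthless: cheaper to buy at market -> 0.0
print(call_payoff(60.0, 50.0))  # buy at 50, sell at 60 -> profit 10.0
```

The whole subject starts from the question of what this random payoff is worth today, before we know which scenario occurs.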

The huge impetus in financial derivatives was the seminal paper of Black and Scholes in 1973. Although many researchers had studied this question, Black and Scholes gave a definitive answer, and a great deal of research has been done since. These are not just academic questions; today the market in financial derivatives is larger than the market in stock securities. In other words, more money is invested in options on stocks than in stocks themselves.

Options have been around for a long time. The earliest ones were used by manufacturers and food producers to hedge their risk. A farmer might agree to sell a bushel of wheat at a fixed price six months from now rather than take a chance on the vagaries of market prices. Similarly a steel refinery might want to lock in the price of iron ore at a fixed price.

The sections of these notes can be grouped into five categories. The first is elementary probability. Although someone who has had a course in undergraduate probability will be familiar with some of this, we will talk about a number of topics that are not usually covered in such a course: σ-fields, conditional expectations, martingales. The second category is the binomial asset pricing model. This is just about the simplest model of a stock that one can imagine, and this will provide a case where we can see most of the major ideas of mathematical finance, but in a very simple setting. Then we will turn to advanced probability, that is, ideas such as Brownian motion, stochastic integrals, stochastic differential equations, Girsanov transformation. Although to do this rigorously requires measure theory, we can still learn enough to understand and work with these concepts. We then return to finance and work with the continuous model. We will derive the Black-Scholes formula, see the Fundamental Theorem of Asset Pricing, work with equivalent martingale measures, and the like. The fifth main category is term structure models, which means models of interest rate behavior.

I found some unpublished notes of Steve Shreve extremely useful in preparing these notes. I hope that he has turned them into a book and that this book is now available. The stochastic calculus part of these notes is from my own book: Probabilistic Techniques in Analysis, Springer, New York, 1995.

I would also like to thank Evarist Giné who pointed out a number of errors.


2. Review of elementary probability.

Let's begin by recalling some of the definitions and basic concepts of elementary probability. We will only work with discrete models at first.

We start with an arbitrary set, called the probability space, which we will denote by Ω, the capital Greek letter "omega." We are given a class F of subsets of Ω. These are called events. We require F to be a σ-field.

Definition 2.1. A collection F of subsets of Ω is called a σ-field if

(1) ∅ ∈ F,
(2) Ω ∈ F,
(3) A ∈ F implies A^c ∈ F, and
(4) A_1, A_2, . . . ∈ F implies both ∪_{i=1}^∞ A_i ∈ F and ∩_{i=1}^∞ A_i ∈ F.

Here A^c = {ω ∈ Ω : ω ∉ A} denotes the complement of A. ∅ denotes the empty set, that is, the set with no elements. We will use without special comment the usual notations of ∪ (union), ∩ (intersection), ⊂ (contained in), ∈ (is an element of).

Typically, in an elementary probability course, F will consist of all subsets of Ω, but we will later need to distinguish between various σ-fields. Here is an example. Suppose one tosses a coin two times and lets Ω denote all possible outcomes. So Ω = {HH, HT, TH, TT}. A typical σ-field F would be the collection of all subsets of Ω. In this case it is trivial to show that F is a σ-field, since every subset is in F. But if we let G = {∅, Ω, {HH, HT}, {TH, TT}}, then G is also a σ-field. One has to check the definition, but to illustrate, the event {HH, HT} is in G, so we require the complement of that set to be in G as well. But the complement is {TH, TT} and that event is indeed in G.
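On a finite space, checking the definition of a σ-field is a purely mechanical task, and one can even do it by computer. Here is a small sketch (the function name and the use of frozensets are our own choices, not anything from the notes); on a finite space it is enough to check pairwise unions and intersections.

```python
# Check that G from the coin-toss example satisfies Definition 2.1.
from itertools import combinations

omega = frozenset({"HH", "HT", "TH", "TT"})
G = {frozenset(), omega, frozenset({"HH", "HT"}), frozenset({"TH", "TT"})}

def is_sigma_field(collection, space):
    """Verify the defining properties on a finite space."""
    if frozenset() not in collection or space not in collection:
        return False
    for A in collection:
        if space - A not in collection:      # closed under complement
            return False
    for A, B in combinations(collection, 2):
        if A | B not in collection or A & B not in collection:
            return False                     # closed under union/intersection
    return True

print(is_sigma_field(G, omega))  # True
```

Dropping, say, {TH, TT} from G would break closure under complements, and the check would fail.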

One point of view which we will explore much more fully later on is that the σ-field tells you what events you "know." In this example, F is the σ-field where you "know" everything, while G is the σ-field where you "know" only the result of the first toss but not the second. We won't try to be precise here, but to try to add to the intuition, suppose one knows whether an event in F has happened or not for a particular outcome. We would then know which of the events {HH}, {HT}, {TH}, or {TT} has happened and so would know what the two tosses of the coin showed. On the other hand, if we know which events in G happened, we would only know whether the event {HH, HT} happened, which means we would know that the first toss was a heads, or we would know whether the event {TH, TT} happened, in which case we would know that the first toss was a tails. But there is no way to tell what happened on the second toss from knowing which events in G happened. Much more on this later.

The third basic ingredient is a probability.


Definition 2.2. A function P on F is a probability if it satisfies

(1) if A ∈ F, then 0 ≤ P(A) ≤ 1,
(2) P(Ω) = 1,
(3) P(∅) = 0, and
(4) if A_1, A_2, . . . ∈ F are pairwise disjoint, then P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i).

A collection of sets A_i is pairwise disjoint if A_i ∩ A_j = ∅ unless i = j.

There are a number of conclusions one can draw from this definition. As one example, if A ⊂ B, then P(A) ≤ P(B) and P(A^c) = 1 − P(A). See Note 1 at the end of this section for a proof.
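On a concrete finite space, these consequences can be checked directly by summing outcome probabilities. A small illustrative computation (the sets A and B and the fair two-coin measure are our own choices):

```python
# Check P(A) <= P(B) for A ⊂ B, and P(A^c) = 1 - P(A),
# on the fair two-coin space.

P = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}

def prob(event):
    """P(event) as the sum of the probabilities of its outcomes."""
    return sum(P[w] for w in event)

A = {"HH"}                    # A is a subset of B
B = {"HH", "HT"}
Ac = set(P) - A               # complement of A

print(prob(A) <= prob(B))                      # True
print(abs(prob(Ac) - (1 - prob(A))) < 1e-12)   # True
```

Of course, the proof in Note 1 works for any probability, not just this example; the computation only illustrates the statement.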

Someone who has had measure theory will realize that a σ-field is the same thing as a σ-algebra and a probability is a measure of total mass one.

A random variable (abbreviated r.v.) is a function X from Ω to R, the reals. To be more precise, to be a r.v. X must also be measurable, which means that {ω : X(ω) ≥ a} ∈ F for all reals a.

The notion of measurability has a simple definition but is a bit subtle. If we take the point of view that we know all the events in G, then if Y is G-measurable, then we know Y. Phrased another way, suppose we know whether or not the event has occurred for each event in G. Then if Y is G-measurable, we can compute the value of Y.

Here is an example. In the example above where we tossed a coin two times, let X be the number of heads in the two tosses. Then X is F measurable but not G measurable. To see this, let us consider A_a = {ω ∈ Ω : X(ω) ≥ a}. This event will equal

    Ω                  if a ≤ 0;
    {HH, HT, TH}       if 0 < a ≤ 1;
    {HH}               if 1 < a ≤ 2;
    ∅                  if 2 < a.

For example, if a = 3/2, then the event where the number of heads is 3/2 or greater is the event where we had two heads, namely, {HH}. Now observe that for each a the event A_a is in F because F contains all subsets of Ω. Therefore X is measurable with respect to F. However it is not true that A_a is in G for every value of a; take a = 3/2 as just one example: the subset {HH} is not in G. So X is not measurable with respect to the σ-field G.
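Since the spaces here are finite, this measurability check can also be carried out exhaustively: list every event A_a and test membership in F and in G. A sketch of that computation (all names are our own):

```python
# X = number of heads on the two-coin space; A_a = {omega : X(omega) >= a}.
from itertools import chain, combinations

omega = ("HH", "HT", "TH", "TT")
X = {w: w.count("H") for w in omega}        # number of heads

def powerset(s):
    """All subsets of s, as frozensets."""
    return {frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))}

F = powerset(omega)                          # every subset of omega
G = {frozenset(), frozenset(omega),
     frozenset({"HH", "HT"}), frozenset({"TH", "TT"})}

def A(a):
    """The event A_a = {omega : X(omega) >= a}."""
    return frozenset(w for w in omega if X[w] >= a)

print(all(A(a) in F for a in (-1, 0.5, 1.5, 3)))  # True: X is F-measurable
print(A(1.5) in G)                                 # False: {HH} is not in G
```

Only the four events displayed above ever arise as A_a, so checking one representative a from each interval settles measurability.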

A discrete r.v. is one where P(ω : X(ω) = a) = 0 for all but countably many a's, say, a_1, a_2, . . ., and Σ_i P(ω : X(ω) = a_i) = 1. In defining sets one usually omits the ω; thus (X = x) means the same as {ω : X(ω) = x}.

In the discrete case, to check measurability with respect to a σ-field F, it is enough that (X = a) ∈ F for all reals a. The reason for this is that if x_1, x_2, . . . are the values of x for which P(X = x) ≠ 0, then we can write (X ≥ a) = ∪_{x_i ≥ a}(X = x_i) and we have a countable union. So if (X = x_i) ∈ F, then (X ≥ a) ∈ F.

Given a discrete r.v. X, the expectation or mean is defined by

    E X = Σ_x x P(X = x)

provided the sum converges. If X only takes finitely many values, then this is a finite sum and of course it will converge. This is the situation that we will consider for quite some time. However, if X can take an infinite number of values (but countable), convergence needs to be checked. For example, if P(X = 2^n) = 2^{−n} for n = 1, 2, . . ., then

    E X = Σ_{n=1}^∞ 2^n · 2^{−n} = ∞.
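The divergence in this example is easy to see numerically: every term 2^n · 2^{−n} equals 1, so the N-th partial sum is exactly N and grows without bound. A quick illustrative check:

```python
# Partial sums of the series for E X when P(X = 2^n) = 2^{-n}.
# Each term (2^n) * (2^-n) equals 1, so the partial sum through N is N.

def partial_sum(N):
    return sum((2 ** n) * (2 ** -n) for n in range(1, N + 1))

for N in (10, 100, 1000):
    print(N, partial_sum(N))   # the partial sum equals N
```

Since the partial sums tend to infinity, the expectation does not exist as a finite number.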

There is an alternate definition of expectation which is equivalent in the discrete setting. Set

    E X = Σ_{ω∈Ω} X(ω) P({ω}).

To see that this is the same, look at Note 2 at the end of the section. The advantage of the second definition is that some properties of expectation, such as E(X + Y) = E X + E Y, are immediate, while with the first definition they require quite a bit of proof.
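The equivalence of the two definitions is easy to see on the two-coin example, where X is the number of heads. A small illustrative computation (names are our own):

```python
# E X two ways on the fair two-coin space:
# (i) sum over the values x of x * P(X = x);
# (ii) sum over the outcomes omega of X(omega) * P({omega}).

P = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}
X = {w: w.count("H") for w in P}               # number of heads

values = set(X.values())
EX_by_values = sum(x * sum(P[w] for w in P if X[w] == x) for x in values)

EX_by_outcomes = sum(X[w] * P[w] for w in P)

print(EX_by_values, EX_by_outcomes)   # both equal 1.0
```

The second form groups nothing: it just runs over the outcomes, which is why linearity in X is immediate from it.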

We say two events A and B are independent if P(A ∩ B) = P(A)P(B). Two random variables X and Y are independent if P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B) for all A and B that are subsets of the reals. The comma in the expression P(X ∈ A, Y ∈ B) means "and." Thus

    P(X ∈ A, Y ∈ B) = P((X ∈ A) ∩ (Y ∈ B)).

The extension of the definition of independence to the case of more than two events or random variables is not surprising: A_1, . . . , A_n are independent if

    P(A_{i_1} ∩ · · · ∩ A_{i_j}) = P(A_{i_1}) · · · P(A_{i_j})

whenever {i_1, . . . , i_j} is a subset of {1, . . . , n}.

A common misconception is that an event is independent of itself. If A is an event that is independent of itself, then

    P(A) = P(A ∩ A) = P(A)P(A) = (P(A))^2.

The only finite solutions to the equation x = x^2 are x = 0 and x = 1, so an event is independent of itself only if it has probability 0 or 1.

Two σ-fields F and G are independent if A and B are independent whenever A ∈ F and B ∈ G. A r.v. X and a σ-field G are independent if P((X ∈ A) ∩ B) = P(X ∈ A)P(B) whenever A is a subset of the reals and B ∈ G.


As an example, suppose we toss a coin two times and we define the σ-fields G_1 = {∅, Ω, {HH, HT}, {TH, TT}} and G_2 = {∅, Ω, {HH, TH}, {HT, TT}}. Then G_1 and G_2 are independent if P(HH) = P(HT) = P(TH) = P(TT) = 1/4. (Here we are writing P(HH) when a more accurate way would be to write P({HH}).) An easy way to understand this is that if we look at an event in G_1 that is not ∅ or Ω, then that is the event that the first toss is a heads or it is the event that the first toss is a tails. Similarly, a set other than ∅ or Ω in G_2 will be the event that the second toss is a heads or that the second toss is a tails.
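Since both σ-fields have only four events each, the independence can be verified by brute force over all sixteen pairs. A sketch of that check (our own computation, assuming the uniform measure from the text):

```python
# Verify that G_1 and G_2 are independent under the fair-coin measure:
# P(A ∩ B) = P(A) P(B) for every A in G_1 and B in G_2.

P = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}
omega = frozenset(P)

def prob(event):
    return sum(P[w] for w in event)

G1 = [frozenset(), omega, frozenset({"HH", "HT"}), frozenset({"TH", "TT"})]
G2 = [frozenset(), omega, frozenset({"HH", "TH"}), frozenset({"HT", "TT"})]

independent = all(
    abs(prob(A & B) - prob(A) * prob(B)) < 1e-12
    for A in G1 for B in G2
)
print(independent)   # True
```

The interesting pairs are the ones like A = {HH, HT} and B = {HH, TH}, where P(A ∩ B) = P({HH}) = 1/4 = (1/2)(1/2); the pairs involving ∅ or Ω hold trivially.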

If two r.v.s X and Y are independent, we have the multiplication theorem, which says that E(XY) = (E X)(E Y) provided all the expectations are finite. See Note 3 for a proof.
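The multiplication theorem is easy to see in the two-coin setting, taking X and Y to be indicators of heads on the first and second toss (the setup is our own illustration, not from the notes):

```python
# E(XY) = (E X)(E Y) for two independent r.v.s on the fair two-coin space:
# X = 1 if the first toss is heads, Y = 1 if the second toss is heads.

P = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}
X = {w: 1 if w[0] == "H" else 0 for w in P}
Y = {w: 1 if w[1] == "H" else 0 for w in P}

E = lambda Z: sum(Z[w] * P[w] for w in P)
EXY = sum(X[w] * Y[w] * P[w] for w in P)

print(EXY, E(X) * E(Y))   # 0.25 0.25
```

Here XY = 1 only on the outcome HH, so E(XY) = 1/4, while E X = E Y = 1/2.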

Suppose X_1, . . . , X_n are n independent r.v.s, such that for each one P(X_i = 1) = p, P(X_i = 0) = 1 − p, where p ∈ [0, 1]. The random variable S_n = Σ_{i=1}^n X_i is called a binomial r.v., and represents, for example, the number of successes in n trials, where the probability of a success is p. An important result in probability is that

    P(S_n = k) = n!/(k!(n − k)!) p^k (1 − p)^{n−k}.
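For small n this formula can be checked against a brute-force enumeration of all 2^n sequences of successes and failures. A sketch (the specific n and p are arbitrary choices of ours):

```python
# The binomial formula P(S_n = k) = n!/(k!(n-k)!) p^k (1-p)^(n-k),
# compared with summing the probability of every n-trial sequence
# that has exactly k successes.
from math import factorial
from itertools import product

def binom_pmf(n, k, p):
    coeff = factorial(n) // (factorial(k) * factorial(n - k))
    return coeff * p ** k * (1 - p) ** (n - k)

def binom_pmf_bruteforce(n, k, p):
    total = 0.0
    for trial in product((0, 1), repeat=n):     # every success/failure sequence
        if sum(trial) == k:
            total += p ** k * (1 - p) ** (n - k)
    return total

n, p = 5, 0.3
print(all(abs(binom_pmf(n, k, p) - binom_pmf_bruteforce(n, k, p)) < 1e-12
          for k in range(n + 1)))   # True
```

The enumeration makes the combinatorics visible: each sequence with k successes has probability p^k (1 − p)^{n−k}, and there are n!/(k!(n − k)!) such sequences.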

The variance of a random variable is

    Var X = E[(X − E X)^2].

This is also equal to

    E[X^2] − (E X)^2.

It is an easy consequence of the multiplication theorem that if X and Y are independent,

    Var(X + Y) = Var X + Var Y.

The expression E[X^2] is sometimes called the second moment of X.
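Both variance identities can be illustrated with the independent toss indicators from before (again an illustrative computation of ours, not from the notes):

```python
# Var X = E[(X - E X)^2] and Var(X + Y) = Var X + Var Y for the
# independent indicators of heads on the first and second tosses.

P = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}
X = {w: 1 if w[0] == "H" else 0 for w in P}    # first toss
Y = {w: 1 if w[1] == "H" else 0 for w in P}    # second toss

E = lambda Z: sum(Z[w] * P[w] for w in P)

def var(Z):
    m = E(Z)
    return sum((Z[w] - m) ** 2 * P[w] for w in P)

S = {w: X[w] + Y[w] for w in P}                # S = X + Y = number of heads
print(var(X), var(Y), var(S))   # 0.25 0.25 0.5
```

Each indicator has variance p(1 − p) = 1/4 here, and the variance of the sum is the sum of the variances precisely because X and Y are independent.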

We close this section with a definition of conditional probability. The probability of A given B, written P(A | B), is defined by

    P(A ∩ B) / P(B),

provided P(B) ≠ 0. The conditional expectation of X given B is defined to be

    E[X; B] / P(B),
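On the two-coin space these definitions become concrete sums. A small sketch (our own illustration, reading E[X; B] as the expectation of X restricted to the event B, i.e. Σ over ω ∈ B of X(ω)P({ω})):

```python
# P(A | B) = P(A ∩ B) / P(B) and E[X | B] = E[X; B] / P(B)
# on the fair two-coin space, with X = number of heads.

P = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}
X = {w: w.count("H") for w in P}
B = {"HH", "HT"}                               # first toss is heads

prob = lambda event: sum(P[w] for w in event)

def cond_prob(A, B):
    return prob(A & B) / prob(B)

def cond_exp(X, B):
    return sum(X[w] * P[w] for w in B) / prob(B)

print(cond_prob({"HH"}, B))   # 0.5
print(cond_exp(X, B))         # 1.5
```

Given that the first toss was heads, the chance of two heads is 1/2 and the expected number of heads is 3/2, matching the intuition that one head is already in hand.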
