The Basics of Financial Mathematics

Spring 2003


Richard F. Bass
Department of Mathematics
University of Connecticut

These notes are © 2003 by Richard Bass. They may be used for personal use or
class use, but not for commercial purposes. If you find any errors, I would appreciate
hearing from you: bass@math.uconn.edu




1. Introduction.
In this course we will study mathematical finance. Mathematical finance is not
about predicting the price of a stock. What it is about is figuring out the price of options
and derivatives.
The most familiar type of option is the option to buy a stock at a given price at
a given time. For example, suppose Microsoft is currently selling today at $40 per share.
A European call option is something I can buy that gives me the right to buy a share of
Microsoft at some future date. To make up an example, suppose I have an option that
allows me to buy a share of Microsoft for $50 in three months' time, but does not compel
me to do so. If Microsoft happens to be selling at $45 in three months' time, the option is
worthless. I would be silly to buy a share for $50 when I could call my broker and buy it
for $45. So I would choose not to exercise the option. On the other hand, if Microsoft is
selling for $60 three months from now, the option would be quite valuable. I could exercise
the option and buy a share for $50. I could then turn around and sell the share on the
open market for $60 and make a profit of $10 per share. Therefore this stock option I
possess has some value. There is some chance it is worthless and some chance that it will
lead me to a profit. The basic question is: how much is the option worth today?
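To make the arithmetic of the example concrete, here is a small Python sketch of the payoff logic at expiry (the function name is ours, not from the notes; the $50 strike and the $45 and $60 scenarios are the ones in the example):

```python
# Payoff per share of a European call option at expiry:
# exercise only if the market price exceeds the strike.
def call_payoff(stock_price: float, strike: float) -> float:
    return max(stock_price - strike, 0.0)

# Microsoft at $45 in three months: the $50 option expires worthless.
print(call_payoff(45, 50))   # 0.0
# Microsoft at $60: exercise at $50, sell at $60, profit $10 per share.
print(call_payoff(60, 50))   # 10.0
```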
The huge impetus in financial derivatives was the seminal paper of Black and Scholes
in 1973. Although many researchers had studied this question, Black and Scholes gave a
definitive answer, and a great deal of research has been done since. These are not just
academic questions; today the market in financial derivatives is larger than the market
in stock securities. In other words, more money is invested in options on stocks than in
stocks themselves.
Options have been around for a long time. The earliest ones were used by manufacturers
and food producers to hedge their risk. A farmer might agree to sell a bushel of
wheat at a fixed price six months from now rather than take a chance on the vagaries of
market prices. Similarly a steel refinery might want to lock in the price of iron ore at a
fixed price.
The sections of these notes can be grouped into five categories. The first is elementary
probability. Although someone who has had a course in undergraduate probability
will be familiar with some of this, we will talk about a number of topics that are not usually
covered in such a course: σ-fields, conditional expectations, martingales. The second
category is the binomial asset pricing model. This is just about the simplest model of a
stock that one can imagine, and this will provide a case where we can see most of the major
ideas of mathematical finance, but in a very simple setting. Then we will turn to advanced
probability, that is, ideas such as Brownian motion, stochastic integrals, stochastic differential
equations, Girsanov transformation. Although to do this rigorously requires measure
theory, we can still learn enough to understand and work with these concepts. We then
return to finance and work with the continuous model. We will derive the Black-Scholes
formula, see the Fundamental Theorem of Asset Pricing, work with equivalent martingale
measures, and the like. The fifth main category is term structure models, which means
models of interest rate behavior.
I found some unpublished notes of Steve Shreve extremely useful in preparing these
notes. I hope that he has turned them into a book and that this book is now available.
The stochastic calculus part of these notes is from my own book: Probabilistic Techniques
in Analysis, Springer, New York, 1995.
I would also like to thank Evarist Giné, who pointed out a number of errors.




2. Review of elementary probability.
Let's begin by recalling some of the definitions and basic concepts of elementary
probability. We will only work with discrete models at first.
We start with an arbitrary set, called the probability space, which we will denote
by Ω, the capital Greek letter "omega." We are given a class F of subsets of Ω. These are
called events. We require F to be a σ-field.

Definition 2.1. A collection F of subsets of Ω is called a σ-field if
(1) ∅ ∈ F,
(2) Ω ∈ F,
(3) A ∈ F implies A^c ∈ F, and
(4) A1, A2, . . . ∈ F implies both ∪_{i=1}^∞ A_i ∈ F and ∩_{i=1}^∞ A_i ∈ F.

Here A^c = {ω ∈ Ω : ω ∉ A} denotes the complement of A. ∅ denotes the empty set, that
is, the set with no elements. We will use without special comment the usual notations of
∪ (union), ∩ (intersection), ⊂ (contained in), ∈ (is an element of).
Typically, in an elementary probability course, F will consist of all subsets of
Ω, but we will later need to distinguish between various σ-fields. Here is an example.
Suppose one tosses a coin two times and lets Ω denote all possible outcomes. So
Ω = {HH, HT, TH, TT}. A typical σ-field F would be the collection of all subsets of Ω.
In this case it is trivial to show that F is a σ-field, since every subset is in F. But if
we let G = {∅, Ω, {HH, HT}, {TH, TT}}, then G is also a σ-field. One has to check the
definition, but to illustrate, the event {HH, HT} is in G, so we require the complement of
that set to be in G as well. But the complement is {TH, TT}, and that event is indeed in
G.
One point of view which we will explore much more fully later on is that the σ-field
tells you what events you "know." In this example, F is the σ-field where you "know"
everything, while G is the σ-field where you "know" only the result of the first toss but not
the second. We won't try to be precise here, but to try to add to the intuition, suppose
one knows whether an event in F has happened or not for a particular outcome. We
would then know which of the events {HH}, {HT}, {TH}, or {TT} has happened, and so
would know what the two tosses of the coin showed. On the other hand, if we know which
events in G happened, we would only know whether the event {HH, HT} happened, which
means we would know that the first toss was a heads, or we would know whether the event
{TH, TT} happened, in which case we would know that the first toss was a tails. But
there is no way to tell what happened on the second toss from knowing which events in G
happened. Much more on this later.
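Definition 2.1 can be checked mechanically for a finite collection of events. The sketch below (our own helper function, not from the notes) verifies that the collection G from the example is a σ-field; for a finite collection, closure under pairwise unions and intersections already gives closure under countable ones:

```python
# Brute-force check of Definition 2.1 on the two-coin sample space.
Omega = frozenset({"HH", "HT", "TH", "TT"})

def is_sigma_field(collection, omega):
    sets = {frozenset(s) for s in collection}
    if frozenset() not in sets or omega not in sets:   # (1) and (2)
        return False
    if any(omega - A not in sets for A in sets):       # (3) complements
        return False
    # (4): for finitely many sets, pairwise closure suffices.
    return all(A | B in sets and A & B in sets
               for A in sets for B in sets)

G = [set(), set(Omega), {"HH", "HT"}, {"TH", "TT"}]
print(is_sigma_field(G, Omega))   # True
# Dropping the complement {TH, TT} breaks condition (3):
print(is_sigma_field([set(), set(Omega), {"HH", "HT"}], Omega))   # False
```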

The third basic ingredient is a probability.

Definition 2.2. A function P on F is a probability if it satisfies
(1) if A ∈ F, then 0 ≤ P(A) ≤ 1,
(2) P(Ω) = 1,
(3) P(∅) = 0, and
(4) if A1, A2, . . . ∈ F are pairwise disjoint, then P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i).


A collection of sets A_i is pairwise disjoint if A_i ∩ A_j = ∅ unless i = j.
There are a number of conclusions one can draw from this definition. As one
example, if A ⊂ B, then P(A) ≤ P(B) and P(A^c) = 1 − P(A). See Note 1 at the end of
this section for a proof.
Someone who has had measure theory will realize that a σ-field is the same thing
as a σ-algebra and a probability is a measure of total mass one.

A random variable (abbreviated r.v.) is a function X from Ω to R, the reals. To
be more precise, to be a r.v. X must also be measurable, which means that
{ω : X(ω) ≥ a} ∈ F for all reals a.
The notion of measurability has a simple definition but is a bit subtle. If we take
the point of view that we know all the events in G, then if Y is G-measurable, then we
know Y. Phrased another way, suppose we know, for each event in G, whether or not
that event has occurred. Then if Y is G-measurable, we can compute the value of Y.
Here is an example. In the example above, where we tossed a coin two times, let X
be the number of heads in the two tosses. Then X is F-measurable but not G-measurable.
To see this, let us consider A_a = {ω ∈ Ω : X(ω) ≥ a}. This event will equal

    Ω                 if a ≤ 0;
    {HH, HT, TH}      if 0 < a ≤ 1;
    {HH}              if 1 < a ≤ 2;
    ∅                 if 2 < a.

For example, if a = 3/2, then the event where the number of heads is 3/2 or greater is the
event where we had two heads, namely, {HH}. Now observe that for each a the event A_a
is in F because F contains all subsets of Ω. Therefore X is measurable with respect to F.
However it is not true that A_a is in G for every value of a; take a = 3/2 as just one example:
the subset {HH} is not in G. So X is not measurable with respect to the σ-field G.
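The measurability check in this example can also be run directly: compute each level set A_a = (X ≥ a) and ask whether it belongs to the σ-field. This sketch (our own code) checks a only at the values X actually takes, which suffices here since every other level set coincides with one of these or is Ω or ∅:

```python
from itertools import combinations

Omega = ["HH", "HT", "TH", "TT"]
X = {w: w.count("H") for w in Omega}   # number of heads

# F = all subsets of Omega; G = the smaller sigma-field from the text.
F = {frozenset(c) for r in range(len(Omega) + 1)
     for c in combinations(Omega, r)}
G = {frozenset(), frozenset(Omega),
     frozenset({"HH", "HT"}), frozenset({"TH", "TT"})}

def is_measurable(X, sigma_field, omega):
    for a in sorted(set(X.values())):
        level_set = frozenset(w for w in omega if X[w] >= a)
        if level_set not in sigma_field:
            return False
    return True

print(is_measurable(X, F, Omega))   # True
print(is_measurable(X, G, Omega))   # False: e.g. (X >= 1) is not in G
```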

A discrete r.v. is one where P(ω : X(ω) = a) = 0 for all but countably many a's,
say, a1, a2, . . ., and Σ_i P(ω : X(ω) = a_i) = 1. In defining sets one usually omits the ω;
thus (X = x) means the same as {ω : X(ω) = x}.
In the discrete case, to check measurability with respect to a σ-field F, it is enough
that (X = a) ∈ F for all reals a. The reason for this is that if x1, x2, . . . are the values of
x for which P(X = x) ≠ 0, then we can write (X ≥ a) = ∪_{x_i ≥ a}(X = x_i) and we have a
countable union. So if (X = x_i) ∈ F, then (X ≥ a) ∈ F.
Given a discrete r.v. X, the expectation or mean is defined by

    E X = Σ_x x P(X = x)

provided the sum converges. If X only takes finitely many values, then this is a finite sum
and of course it will converge. This is the situation that we will consider for quite some
time. However, if X can take an infinite number of values (but countable), convergence
needs to be checked. For example, if P(X = 2^n) = 2^(−n) for n = 1, 2, . . ., then
E X = Σ_{n=1}^∞ 2^n · 2^(−n) = Σ_{n=1}^∞ 1 = ∞.
There is an alternate definition of expectation which is equivalent in the discrete
setting. Set

    E X = Σ_{ω∈Ω} X(ω) P({ω}).

To see that this is the same, look at Note 2 at the end of the section. The advantage of the
second definition is that some properties of expectation, such as E(X + Y) = E X + E Y,
are immediate, while with the first definition they require quite a bit of proof.
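Both definitions of expectation can be computed side by side on the running two-coin example (our own sketch; each outcome gets probability 1/4 and X is the number of heads):

```python
from collections import defaultdict

Omega = ["HH", "HT", "TH", "TT"]
P = {w: 0.25 for w in Omega}
X = {w: w.count("H") for w in Omega}   # number of heads

# First definition: sum over the values of X, E X = sum_x x P(X = x).
by_value = defaultdict(float)
for w in Omega:
    by_value[X[w]] += P[w]
e1 = sum(x * p for x, p in by_value.items())

# Second definition: sum over outcomes, E X = sum_w X(w) P({w}).
e2 = sum(X[w] * P[w] for w in Omega)

print(e1, e2)   # 1.0 1.0 -- the two definitions agree
```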
We say two events A and B are independent if P(A ∩ B) = P(A)P(B). Two random
variables X and Y are independent if P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B) for all A
and B that are subsets of the reals. The comma in the expression P(X ∈ A, Y ∈ B) means
"and." Thus

    P(X ∈ A, Y ∈ B) = P((X ∈ A) ∩ (Y ∈ B)).
The extension of the definition of independence to the case of more than two events or
random variables is not surprising: A1, . . . , An are independent if

    P(A_{i1} ∩ · · · ∩ A_{ij}) = P(A_{i1}) · · · P(A_{ij})

whenever {i1, . . . , ij} is a subset of {1, . . . , n}.
A common misconception is that an event is independent of itself. If A is an event
that is independent of itself, then

    P(A) = P(A ∩ A) = P(A)P(A) = (P(A))^2.

The only finite solutions to the equation x = x^2 are x = 0 and x = 1, so an event is
independent of itself only if it has probability 0 or 1.
Two σ-fields F and G are independent if A and B are independent whenever A ∈ F
and B ∈ G. A r.v. X and a σ-field G are independent if P((X ∈ A) ∩ B) = P(X ∈ A)P(B)
whenever A is a subset of the reals and B ∈ G.

As an example, suppose we toss a coin two times and we define the σ-fields
G1 = {∅, Ω, {HH, HT}, {TH, TT}} and G2 = {∅, Ω, {HH, TH}, {HT, TT}}. Then G1 and G2 are
independent if P(HH) = P(HT) = P(TH) = P(TT) = 1/4. (Here we are writing P(HH)
when a more accurate way would be to write P({HH}).) An easy way to understand this
is that if we look at an event in G1 that is not ∅ or Ω, then that is the event that the first
toss is a heads or it is the event that the first toss is a tails. Similarly, a set other than ∅
or Ω in G2 will be the event that the second toss is a heads or that the second toss is a
tails.
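The independence of G1 and G2 claimed above reduces to finitely many product checks, which can be done exhaustively (our own sketch, with the uniform probability 1/4 on outcomes):

```python
Omega = ["HH", "HT", "TH", "TT"]
P = {w: 0.25 for w in Omega}

def prob(event):
    return sum(P[w] for w in event)

G1 = [set(), set(Omega), {"HH", "HT"}, {"TH", "TT"}]   # first toss
G2 = [set(), set(Omega), {"HH", "TH"}, {"HT", "TT"}]   # second toss

# Independence: P(A ∩ B) = P(A)P(B) for every A in G1 and B in G2.
independent = all(
    abs(prob(A & B) - prob(A) * prob(B)) < 1e-12
    for A in G1 for B in G2
)
print(independent)   # True
```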

If two r.v.s X and Y are independent, we have the multiplication theorem, which
says that E(XY) = (E X)(E Y) provided all the expectations are finite. See Note 3 for a
proof.
Suppose X1, . . . , Xn are n independent r.v.s, such that for each one P(X_i = 1) = p,
P(X_i = 0) = 1 − p, where p ∈ [0, 1]. The random variable S_n = Σ_{i=1}^n X_i is called a
binomial r.v., and represents, for example, the number of successes in n trials, where the
probability of a success is p. An important result in probability is that

    P(S_n = k) = [n!/(k!(n − k)!)] p^k (1 − p)^(n−k).
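The binomial formula is easy to evaluate with Python's built-in binomial coefficient (a sketch of our own, not the notes' code):

```python
from math import comb

# P(Sn = k) = n!/(k!(n-k)!) * p^k * (1-p)^(n-k)
def binomial_pmf(n: int, k: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

# With n = 2 fair tosses, Sn is the number of heads from the earlier
# coin example: probabilities 1/4, 1/2, 1/4 for k = 0, 1, 2.
print([binomial_pmf(2, k, 0.5) for k in range(3)])   # [0.25, 0.5, 0.25]
# The probabilities over k = 0, ..., n sum to 1 (up to float rounding):
print(sum(binomial_pmf(10, k, 0.3) for k in range(11)))
```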

The variance of a random variable is

    Var X = E[(X − E X)^2].

This is also equal to

    E[X^2] − (E X)^2.

It is an easy consequence of the multiplication theorem that if X and Y are independent,

    Var(X + Y) = Var X + Var Y.

The expression E[X^2] is sometimes called the second moment of X.
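Both variance identities can be verified numerically on two independent fair-coin indicators (our own sketch; each of the four (X, Y) value pairs has probability 1/4):

```python
from itertools import product

outcomes = list(product([0, 1], repeat=2))   # values of (X, Y)

def E(f):
    """Expectation of f(X, Y) under the uniform probability 1/4."""
    return sum(f(x, y) * 0.25 for x, y in outcomes)

mean_x = E(lambda x, y: x)
var_x = E(lambda x, y: (x - mean_x) ** 2)          # definition
var_x_alt = E(lambda x, y: x * x) - mean_x ** 2    # E[X^2] - (E X)^2
print(var_x, var_x_alt)                            # 0.25 0.25

# Additivity for independent X, Y: Var(X + Y) = Var X + Var Y.
mean_s = E(lambda x, y: x + y)
var_s = E(lambda x, y: (x + y - mean_s) ** 2)
print(var_s)                                       # 0.5 = 0.25 + 0.25
```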
We close this section with a definition of conditional probability. The probability
of A given B, written P(A | B), is defined by

    P(A | B) = P(A ∩ B) / P(B),

provided P(B) ≠ 0. The conditional expectation of X given B is defined to be

    E[X; B] / P(B),
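These formulas can be evaluated directly on the two-coin example, reading E[X; B] in the usual way as the sum of X(ω)P({ω}) over ω ∈ B (the code itself is our own sketch):

```python
Omega = ["HH", "HT", "TH", "TT"]
P = {w: 0.25 for w in Omega}
X = {w: w.count("H") for w in Omega}   # number of heads

B = {"HH", "HT"}                       # first toss is heads
A = {"HH", "TH"}                       # second toss is heads

P_B = sum(P[w] for w in B)
# P(A | B) = P(A ∩ B) / P(B); the tosses don't influence each other,
# so conditioning on the first toss leaves the second at 1/2.
P_A_given_B = sum(P[w] for w in A & B) / P_B
print(P_A_given_B)                     # 0.5

# E[X; B]: expectation of X restricted to the event B.
E_X_B = sum(X[w] * P[w] for w in B)
print(E_X_B / P_B)                     # 1.5, the conditional mean given B
```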
