The Basics of Financial Mathematics

Spring 2003

Richard F. Bass
Department of Mathematics
University of Connecticut

These notes are © 2003 by Richard Bass. They may be used for personal use or
class use, but not for commercial purposes. If you find any errors, I would appreciate
hearing from you: bass@math.uconn.edu

1. Introduction.
In this course we will study mathematical finance. Mathematical finance is not
about predicting the price of a stock. What it is about is figuring out the price of options
and derivatives.
The most familiar type of option is the option to buy a stock at a given price at
a given time. For example, suppose Microsoft is currently selling today at $40 per share.
A European call option is something I can buy that gives me the right to buy a share of
Microsoft at some future date. To make up an example, suppose I have an option that
allows me to buy a share of Microsoft for $50 in three months time, but does not compel
me to do so. If Microsoft happens to be selling at $45 in three months time, the option is
worthless. I would be silly to buy a share for $50 when I could call my broker and buy it
for $45. So I would choose not to exercise the option. On the other hand, if Microsoft is
selling for $60 three months from now, the option would be quite valuable. I could exercise
the option and buy a share for $50. I could then turn around and sell the share on the
open market for $60 and make a profit of $10 per share. Therefore this stock option I
possess has some value. There is some chance it is worthless and some chance that it will
lead me to a profit. The basic question is: how much is the option worth today?
The huge impetus in financial derivatives was the seminal paper of Black and Scholes
in 1973. Although many researchers had studied this question, Black and Scholes gave a
definitive answer, and a great deal of research has been done since. These are not just
academic questions; today the market in financial derivatives is larger than the market
in stock securities. In other words, more money is invested in options on stocks than in
stocks themselves.
Options have been around for a long time. The earliest ones were used by manu-
facturers and food producers to hedge their risk. A farmer might agree to sell a bushel of
wheat at a fixed price six months from now rather than take a chance on the vagaries of
market prices. Similarly a steel refinery might want to lock in the price of iron ore at a
fixed price.
The sections of these notes can be grouped into five categories. The first is elemen-
tary probability. Although someone who has had a course in undergraduate probability
will be familiar with some of this, we will talk about a number of topics that are not usu-
ally covered in such a course: σ-fields, conditional expectations, martingales. The second
category is the binomial asset pricing model. This is just about the simplest model of a
stock that one can imagine, and this will provide a case where we can see most of the major
ideas of mathematical finance, but in a very simple setting. Then we will turn to advanced
probability, that is, ideas such as Brownian motion, stochastic integrals, stochastic differ-
ential equations, Girsanov transformation. Although to do this rigorously requires measure
theory, we can still learn enough to understand and work with these concepts. We then

return to finance and work with the continuous model. We will derive the Black-Scholes
formula, see the Fundamental Theorem of Asset Pricing, work with equivalent martingale
measures, and the like. The fifth main category is term structure models, which means
models of interest rate behavior.
I found some unpublished notes of Steve Shreve extremely useful in preparing these
notes. I hope that he has turned them into a book and that this book is now available.
The stochastic calculus part of these notes is from my own book: Probabilistic Techniques
in Analysis, Springer, New York, 1995.
I would also like to thank Evarist Giné who pointed out a number of errors.

2. Review of elementary probability.
Letā™s begin by recalling some of the deļ¬nitions and basic concepts of elementary
probability. We will only work with discrete models at ļ¬rst.
We start with an arbitrary set, called the probability space, which we will denote
by ā„¦, the capital Greek letter āomega.ā We are given a class F of subsets of ā„¦. These are
called events. We require F to be a Ļ-ļ¬eld.

Deļ¬nition 2.1. A collection F of subsets of ā„¦ is called a Ļ-ļ¬eld if

ā… ā F,
(1)
ā„¦ ā F,
(2)
A ā F implies Ac ā F, and
(3)
A1 , A2 , . . . ā F implies both āŖā Ai ā F and ā©ā Ai ā F.
(4) i=1 i=1

Here Ac = {Ļ ā ā„¦ : Ļ ā A} denotes the complement of A. ā… denotes the empty set, that
/
is, the set with no elements. We will use without special comment the usual notations of
āŖ (union), ā© (intersection), ā‚ (contained in), ā (is an element of).
Typically, in an elementary probability course, F will consist of all subsets of
Ω, but we will later need to distinguish between various σ-fields. Here is an exam-
ple. Suppose one tosses a coin two times and lets Ω denote all possible outcomes. So
Ω = {HH, HT, TH, TT}. A typical σ-field F would be the collection of all subsets of Ω.
In this case it is trivial to show that F is a σ-field, since every subset is in F. But if
we let G = {∅, Ω, {HH, HT}, {TH, TT}}, then G is also a σ-field. One has to check the
definition, but to illustrate, the event {HH, HT} is in G, so we require the complement of
that set to be in G as well. But the complement is {TH, TT} and that event is indeed in
G.
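Since Ω here is finite, the defining properties of a σ-field can be checked mechanically. The following Python sketch (our illustration, not part of the notes; the helper name `is_sigma_field` is ours) verifies Definition 2.1 for the collection G above. On a finite collection, closure under complements and pairwise unions and intersections already gives closure under the countable operations in property (4).

```python
from itertools import combinations

# The sample space for two coin tosses and the candidate sigma-field G.
omega = frozenset({"HH", "HT", "TH", "TT"})
G = {frozenset(), omega, frozenset({"HH", "HT"}), frozenset({"TH", "TT"})}

def is_sigma_field(F, omega):
    """Check Definition 2.1 for a finite collection F of subsets of omega."""
    if frozenset() not in F or omega not in F:      # properties (1) and (2)
        return False
    for A in F:
        if omega - A not in F:                      # property (3): complements
            return False
    # Property (4): on a finite collection, pairwise unions and
    # intersections suffice, since countable ones reduce to these.
    for A, B in combinations(F, 2):
        if A | B not in F or A & B not in F:
            return False
    return True

print(is_sigma_field(G, omega))                                        # True
print(is_sigma_field({frozenset(), omega, frozenset({"HH"})}, omega))  # False
```

The second call fails because the complement of {HH}, namely {HT, TH, TT}, is missing from the collection.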
One point of view which we will explore much more fully later on is that the σ-field
tells you what events you "know." In this example, F is the σ-field where you "know"
everything, while G is the σ-field where you "know" only the result of the first toss but not
the second. We won't try to be precise here, but to try to add to the intuition, suppose
one knows whether an event in F has happened or not for a particular outcome. We
would then know which of the events {HH}, {HT}, {TH}, or {TT} has happened and so
would know what the two tosses of the coin showed. On the other hand, if we know which
events in G happened, we would only know whether the event {HH, HT} happened, which
means we would know that the first toss was a heads, or we would know whether the event
{TH, TT} happened, in which case we would know that the first toss was a tails. But
there is no way to tell what happened on the second toss from knowing which events in G
happened. Much more on this later.

The third basic ingredient is a probability.

Deļ¬nition 2.2. A function P on F is a probability if it satisļ¬es

if A ā F, then 0 ā¤ P(A) ā¤ 1,
(1)
(2) P(ā„¦) = 1, and
(3) P(ā…) = 0, and
ā
if A1 , A2 , . . . ā F are pairwise disjoint, then P(āŖā Ai ) =
(4) P(Ai ).
i=1 i=1

A collection of sets Ai is pairwise disjoint if Ai ā© Aj = ā… unless i = j.
There are a number of conclusions one can draw from this definition. As one
example, if A ⊂ B, then P(A) ≤ P(B) and P(A^c) = 1 − P(A). See Note 1 at the end of
this section for a proof.
Someone who has had measure theory will realize that a σ-field is the same thing
as a σ-algebra and a probability is a measure of total mass one.

A random variable (abbreviated r.v.) is a function X from Ω to R, the reals. To
be more precise, to be a r.v. X must also be measurable, which means that {ω : X(ω) ≥
a} ∈ F for all reals a.
The notion of measurability has a simple definition but is a bit subtle. If we take
the point of view that we know all the events in G, then if Y is G-measurable, then we
know Y. Phrased another way, suppose we know whether or not the event has occurred
for each event in G. Then if Y is G-measurable, we can compute the value of Y.
Here is an example. In the example above where we tossed a coin two times, let X
be the number of heads in the two tosses. Then X is F measurable but not G measurable.
To see this, let us consider A_a = {ω ∈ Ω : X(ω) ≥ a}. This event will equal

Ω               if a ≤ 0;
{HH, HT, TH}    if 0 < a ≤ 1;
{HH}            if 1 < a ≤ 2;
∅               if 2 < a.

For example, if a = 3/2, then the event where the number of heads is 3/2 or greater is the
event where we had two heads, namely, {HH}. Now observe that for each a the event A_a
is in F because F contains all subsets of Ω. Therefore X is measurable with respect to F.
However it is not true that A_a is in G for every value of a; take a = 3/2 as just one example:
the subset {HH} is not in G. So X is not measurable with respect to the σ-field G.

A discrete r.v. is one where P(ω : X(ω) = a) = 0 for all but countably many a's,
say, a1, a2, . . ., and Σ_i P(ω : X(ω) = a_i) = 1. In defining sets one usually omits the ω;
thus (X = x) means the same as {ω : X(ω) = x}.
In the discrete case, to check measurability with respect to a σ-field F, it is enough
that (X = a) ∈ F for all reals a. The reason for this is that if x1, x2, . . . are the values of
x for which P(X = x) ≠ 0, then we can write (X ≥ a) = ∪_{x_i ≥ a}(X = x_i) and we have a
countable union. So if (X = x_i) ∈ F, then (X ≥ a) ∈ F.
Given a discrete r.v. X, the expectation or mean is defined by

E X = Σ_x x P(X = x)

provided the sum converges. If X only takes finitely many values, then this is a finite sum
and of course it will converge. This is the situation that we will consider for quite some
time. However, if X can take an infinite number of values (but countable), convergence
needs to be checked. For example, if P(X = 2^n) = 2^{−n} for n = 1, 2, . . ., then
E X = Σ_{n=1}^∞ 2^n · 2^{−n} = ∞.
There is an alternate definition of expectation which is equivalent in the discrete
setting. Set

E X = Σ_{ω∈Ω} X(ω) P({ω}).

To see that this is the same, look at Note 2 at the end of the section. The advantage of the
second definition is that some properties of expectation, such as E(X + Y) = E X + E Y,
are immediate, while with the first definition they require quite a bit of proof.
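For the two-coin example the equivalence of the two definitions can be checked directly. A minimal Python sketch (ours, not the notes'), taking X to be the number of heads under the fair measure:

```python
# Fair two-coin toss: P({w}) = 1/4 for each outcome.
omega = ["HH", "HT", "TH", "TT"]
P = {w: 0.25 for w in omega}
X = {w: w.count("H") for w in omega}   # number of heads

# First definition: sum over values x of x * P(X = x).
values = set(X.values())
ex1 = sum(x * sum(P[w] for w in omega if X[w] == x) for x in values)

# Second definition: sum over outcomes w of X(w) * P({w}).
ex2 = sum(X[w] * P[w] for w in omega)

print(ex1, ex2)   # 1.0 1.0
```

Both computations group the same products x·P({ω}) differently, which is exactly the content of Note 2.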
We say two events A and B are independent if P(A ∩ B) = P(A)P(B). Two random
variables X and Y are independent if P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B) for all A
and B that are subsets of the reals. The comma in the expression P(X ∈ A, Y ∈ B) means
"and." Thus

P(X ∈ A, Y ∈ B) = P((X ∈ A) ∩ (Y ∈ B)).

The extension of the definition of independence to the case of more than two events or
random variables is not surprising: A1, . . . , An are independent if

P(A_{i1} ∩ · · · ∩ A_{ij}) = P(A_{i1}) · · · P(A_{ij})

whenever {i1, . . . , ij} is a subset of {1, . . . , n}.
A common misconception is that an event is independent of itself. If A is an event
that is independent of itself, then

P(A) = P(A ∩ A) = P(A)P(A) = (P(A))^2.

The only finite solutions to the equation x = x^2 are x = 0 and x = 1, so an event is
independent of itself only if it has probability 0 or 1.
Two Ļ-ļ¬elds F and G are independent if A and B are independent whenever A ā F
and B ā G. A r.v. X and a Ļ-ļ¬eld G are independent if P((X ā A) ā© B) = P(X ā A)P(B)
whenever A is a subset of the reals and B ā G.

As an example, suppose we toss a coin two times and we define the σ-fields G1 =
{∅, Ω, {HH, HT}, {TH, TT}} and G2 = {∅, Ω, {HH, TH}, {HT, TT}}. Then G1 and G2 are
independent if P(HH) = P(HT) = P(TH) = P(TT) = 1/4. (Here we are writing P(HH)
when a more accurate way would be to write P({HH}).) An easy way to understand this
is that if we look at an event in G1 that is not ∅ or Ω, then that is the event that the first
toss is a heads or it is the event that the first toss is a tails. Similarly, a set other than ∅
or Ω in G2 will be the event that the second toss is a heads or that the second toss is a
tails.
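The independence of G1 and G2 under the fair-coin measure is a finite check: every A in G1 and B in G2 must satisfy P(A ∩ B) = P(A)P(B). A short Python sketch of that check (our illustration, not from the notes):

```python
omega = ["HH", "HT", "TH", "TT"]
P = {w: 0.25 for w in omega}           # the fair measure

def prob(A):
    """P(A) for a subset A of omega."""
    return sum(P[w] for w in A)

G1 = [frozenset(), frozenset(omega),
      frozenset({"HH", "HT"}), frozenset({"TH", "TT"})]   # first-toss events
G2 = [frozenset(), frozenset(omega),
      frozenset({"HH", "TH"}), frozenset({"HT", "TT"})]   # second-toss events

independent = all(
    abs(prob(A & B) - prob(A) * prob(B)) < 1e-12
    for A in G1 for B in G2
)
print(independent)   # True
```

With an unfair coin, say P(HH) = 0.4, the same loop would report False, which matches the condition that all four outcomes have probability 1/4.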

If two r.v.s X and Y are independent, we have the multiplication theorem, which
says that E(XY) = (E X)(E Y) provided all the expectations are finite. See Note 3 for a
proof.
Suppose X1, . . . , Xn are n independent r.v.s, such that for each one P(Xi = 1) = p,
P(Xi = 0) = 1 − p, where p ∈ [0, 1]. The random variable Sn = Σ_{i=1}^n Xi is called a
binomial r.v., and represents, for example, the number of successes in n trials, where the
probability of a success is p. An important result in probability is that

P(Sn = k) = (n! / (k!(n − k)!)) p^k (1 − p)^{n−k}.

The variance of a random variable is

Var X = E[(X − E X)^2].

This is also equal to

E[X^2] − (E X)^2.

It is an easy consequence of the multiplication theorem that if X and Y are independent,

Var(X + Y) = Var X + Var Y.

The expression E[X^2] is sometimes called the second moment of X.
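The additivity of variance for independent summands can be verified concretely in the binomial setting: if X and Y are independent binomials with the same success probability p, their sum is again binomial, so Var(X + Y) can be computed from its own distribution and compared with Var X + Var Y. A Python sketch (our illustration):

```python
from math import comb

def pmf_binom(n, p):
    """The binomial distribution as a dict k -> P(S_n = k)."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def mean(pmf):
    return sum(k * q for k, q in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((k - m) ** 2 * q for k, q in pmf.items())

# X ~ Binomial(4, 0.3), Y ~ Binomial(6, 0.3), independent,
# so X + Y ~ Binomial(10, 0.3).
vx = var(pmf_binom(4, 0.3))    # 4 * 0.3 * 0.7 = 0.84
vy = var(pmf_binom(6, 0.3))    # 6 * 0.3 * 0.7 = 1.26
vxy = var(pmf_binom(10, 0.3))  # 10 * 0.3 * 0.7 = 2.1
print(abs((vx + vy) - vxy) < 1e-9)   # True
```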
We close this section with a definition of conditional probability. The probability
of A given B, written P(A | B), is defined by

P(A | B) = P(A ∩ B) / P(B),

provided P(B) ≠ 0. The conditional expectation of X given B is defined to be

E[X; B] / P(B),
