<<

. 24
( 59 .)



>>

\ >

we can ¬nd a in *f (0),
We shall show that corresponding to each such

with distinct determining distinct . This will contradict the hypothesis that

*f (0) consists of a unique element, and the proof will be accomplished.
It follows from (1) and Corollary 1.1 that for h 9 0

g(9h) 9 g(0) g(h) 9 g(0)
- g (0) : : g (0) - .
\ >

9h h

Since g(0) : f (0) : 0, we get from the preceding inequalities that for all h in I

f (he ) : g(h) 9 h, (2)
 
with equality holding only if h : 0. Let V denote the one-dimensional sub-

space spanned by e . Thus V : +x : x : te , t + R,. De¬ne a linear functional
  
L on V by the formula
 
L (x) : L (te ) : t. (3)
  
Since the representation x : te is unique for x in V , the linear functional L
  
is well de¬ned. Note that L depends on . Combining (2) and (3), we get that
 
for x in V 5 C

f (x) . L (x) : t. (4)
 
If n : 1, we are through. Otherwise, let V denote the two-dimensional

subspace spanned by e and e . Thus,
 
V : +y : y : te ; ue , t, u + R, : +y : y : x ; ue , x + V , u + R,.
    
We shall extend L to a linear functional L de¬ned on V and shall ¬nd a
  
such that for all y in V 5 C
number
 
f (y) . L (y) : t ; u. (5)
  
108 CONVEX FUNCTIONS


Let x and z belong to V 5 C and let 9 0, 9 0. It then follows from the

linearity of L , from the convexity of V 5 C, from (4), and from the convexity
 
of f that


L (x) ; L (z) : L x; z
  
; ; ; ;

-f x; z
; ;

:f (x 9 e ) ; (z ; e )
 
; ;

- f (x 9 e ) ; f (z ; e ).
 
(;) ;
D  
Multiplying through by ; gives

L (x) ; L (z) - f (x 9 e ) ; f (z ; e ),
   
and so

L (x) 9 f (x 9 e ) f (z ; e ) 9 L (z)
 -  . (6)


Denote the left-hand side of (6) by (x, ) and denote the right-hand side of (6)
by (z, ). Hence

sup+ (x, ) : x + V 5 C, 9 0, - inf+ (z, ) : z + V 5 C, 9 0,
 
and both the supremum and in¬mum are ¬nite. Hence there exists a number
such that

L (x) 9 f (x 9 e ) f (x ; e ) 9 L (x)
 -  
- (7)


for all x + V 5 C, 9 0, 9 0.

Now let y be an element in V 5 C that is not in V . Then y : x ; ue with
  
x : te and u " 0. If u 9 0, take : u in (7); if u : 0, take : 9u in (7). Then

from (7) and (3) we get that for all y in V 5 C

f (y) : f (x ; ue ) . L (x) ; u : t ; u,
    
which is (5).
DIFFERENTIABLE CONVEX FUNCTIONS 109


satisfying (1) there exists a
Proceeding inductively, we get that for each

vector : ( , , . . . , ) such that if x : (x , . . . , x ) is in C, then
 L  L
f (x) . 1 , x2

Since f (0) : 0, this says that + *f (0).

The next theorem is an immediate consequence of Theorems 2.1, 2.2, and
3.1.

T 3.2. L et C be an open convex set in RL and let f be real valued and
differentiable on C. T hen f is convex if and only if for each x in C

f (x) . f (x ) ; 1
f (x ), x 9 x 2
  
for all x in C. Also, f is strictly convex if and only if for each x in C and all

x " x in C

f (x) 9 f (x ) ; 1
f (x ), x 9 x 2.
  
Remark 3.1. If n : 2 and f is differentiable at x , then the surface in R de¬ned

by y : f (x) has a tangent plane at x whose equation can be written as

y 9 f (x ) : 1
f (x ), x 9 x 2.
  
Thus, the geometric interpretation of Theorem 3.2 is that a function f that is
differentiable on C is convex on C if and only if for each point x in C the

surface de¬ned by y : f (x) lies above the tangent plane to the surface at x .

This is not surprising since we obtained the result in Theorem 3.2 by
considering the supporting hyperplanes to epi( f ), and for differentiable f, these
are the tangent planes to the surface.

Let f be a real-valued function de¬ned and of class C  on an open set D
in RL. (See Section 11 in Chapter I for the de¬nition of class C  .) For such
functions we have a computationally feasible criterion for convexity. In the
ensuing discussion recall that, by Theorem I.11.3, f is differentiable at each
point x in D.

We ¬rst develop our criterion if D is an interval I 3 R.

L 3.1. L et f be of class C  on an open interval I 3 R. T hen f is convex on
I if and only if f (x) . 0 for all x in I. If f (x) 9 0 or all x in I, then f is strictly
convex.

Note that if f is strictly convex, then we cannot conclude that f (x) 9 0 for
all x, as the example f (x) : x shows. The function f is strictly convex, yet
f (0) : 0.
110 CONVEX FUNCTIONS


Proof. If f is convex on I, then by Corollary 3 to Theorem 1.2, f (x) . 0
on I. Conversely, suppose f (x) . 0 on I. Let x be a ¬xed point in I and let

x be any other point in I. Then by Taylor™s theorem

f (x) 9 f (x ) 9 f (x )(x 9 x ) :  f (x )(x 9 x ) . 0,

   * 

where x is a point between x and x. It now follows from Theorem 3.2 that
* 
f is convex on I. If f (x) 9 0 on I it follows from Theorem 3.2 that f is strictly
convex.

Recall that a symmetric n ; n matrix A with entries (a ) is said to be positive
GH
semide¬nite if the quadratic form

L
xRAx : 1x, Ax2 :  a x
GH GH
G H

is nonnegative for all x in RL. The matrix A is said to be positive de¬nite if
1x, Ax2 is positive for all x " 0 in RL.

T 3.3. L et f be of class C  on an open convex set D. T hen f is convex
on D if and only if the Hessian matrix

*f
H(x) : (x)
*x *x
HG

is positive semide¬nite at each point x in D. If H(x) is positive de¬nite at each x,
then f is strictly convex.
Proof. Let x be a point in D. Since D is open, for each vector v in RL, v " 0,
there exists an 9 0 such that the line segment (x 9 v, x ; v) is in D. Hence
the function ( ; x, v) in relation (3) in Section 1 is de¬ned on (9 , ). Also,
since f is of class C  on D, *f (x)/*x *x : *f (x)/*x *x at each x in D, and
HG GH
so H(x) is symmetric.
If f is convex, then by Lemma 1.5, the function ( ; x, v) is convex on the
interval (9 , ). By Lemma 3.1, (0; x, v) . 0. If v : (v , . . . , v ), then straight-
 L
forward calculations as in relations (12) and (13) in Section 11 of Chapter I
give

d L *f
(t; x, v) : f (x ; tv) :  (x ; tv)v ,
G
dt *x
G G
L L *f
(t; x, v) :   f (x ; tv)v v : 1v, H(x ; tv)v2. (8)
GH
*x *x
G H H G
DIFFERENTIABLE CONVEX FUNCTIONS 111


Setting t : 0 gives

1v, H(x)v2 : (0; x, v) . 0. (9)

Since (9) holds for all v, H(x) is positive semide¬nite.
Conversely, suppose now that H(x) is positive semide¬nite for each x in D.
Then from (8), for each x in D and for each v in RL, (t ; x, v) . 0 for all t in
(9 , ). Hence by Lemma 3.1, ( ; x, v) is convex. Since x and v are arbitrary,
it follows from Lemma 1.5 that f is convex on D. The argument also shows
that if H(x) is positive de¬nite for all x in D, then f is strictly convex.

We assume that the reader is familiar with the following facts. Given a
symmetric matrix A, there exists an orthogonal matrix P such that PRAP : D,
where D is a diagonal matrix whose diagonal elements are the eigenvalues of
A. Thus A is positive de¬nite if and only if all the diagonal entries of D
(eigenvalues of A) are positive and A is positive semide¬nite if and only if they
are nonnegative. For a survey of numerical methods for obtaining the eigen-
values of a symmetric matrix see Golub and Van Loan [1996].
Let A be an n ; n matrix with entries a . Let
GH

a %a
 I
:a , : det $ \$, k : 2, 3, 4, . . . , n.
  I
a %a
I II

are called the principal minors of A. Another criterion for
The determinants
I
positive de¬niteness of a symmetric matrix A with entries a is the following.
GH

L 3.2. T he symmetric matrix A is positive de¬nite if and only if 9 0 for
I
k : 1, . . . , n and is negative de¬nite if and only if (91)I 9 0 for k : 1, . . . , n.
I

<<

. 24
( 59 .)



>>