<<

. 23
( 59 .)



>>



2. SUBGRADIENTS

If f is a real-valued function de¬ned on a set C in RL, then the graph of f is
the set of points (x, z) in RL> of the form (x, f (x)), where x + C. If f is a
differentiable function, then given a point x in the interior of C, the graph of

f has a tangent plane at (x , f (x )) whose normal vector is (
f (x ), 91).
  
If C is convex and f is convex, f need not be differentiable at an interior
point of C, as the function f de¬ned by f (x) : "x" shows. In this section we
shall introduce a generalization of the gradient vector, called the subgradient,
that exists at points (x, f (x)) of the graph of a convex function f, where x is
an interior point of C. We shall show that if f is a convex function de¬ned
on C and x is an interior point of C, then epi( f ) has a supporting hyperplane
at the point (x, f (x)). The supporting hyperplane need not be unique. In
the next section we shall show that f is differentiable at x if and only if f has
a unique subgradient at x. This subgradient turns out to be
f (x). Thus, if f
is differentiable at x, there is a unique supporting hyperplane to epi( f ) at
(x, f (x)). This hyperplane is the tangent plane to the graph at (x, f (x)). If
( , 91) is a normal vector to a supporting hyperplane, then has some of the
properties that the gradient vector would have.
We now de¬ne the notion of subgradient.

De¬nition 2.1. A function f de¬ned on a set C is said to have a subgradient at
a point x in C if there exists a vector in RL such that

f (x) . f (x ) ; 1 , x 9 x 2
 
for all x + C. The vector is called a subgradient.
Geometrically, is a subgradient of f at x if the graph of f in RL> lies on

or above the graph of the hyperplane y : f (x );1 , x9x 2. Since (x , f (x ))
   
is in this hyperplane, this hyperplane will be a supporting hyperplane to epi( f )
at (x , f (x )). Thus, the existence of a subgradient is a statement about the
 
existence of a nonvertical supporting hyperplane to epi( f ). In Figure 3.5 we
illustrate the de¬nition with the function f (x) : "x" at x : 0.
Any line through the origin with slope satisfying 91 - - 1 has the
property that

"x" . x : "0" ; (x 9 0)

for all x. Thus any in the interval [91, 1] is a subgradient of "x" at x : 0. If
x 9 0, then : 1 is the only subgradient of "x" at x ; if x : 0, then : 91
  
is the only subgradient of "x" at x .

De¬nition 2.2. The set of subgradients of a function f at a point x is called

the subdifferential of f at x and is denoted by *f (x ).
 
SUBGRADIENTS 103




Figure 3.5.



The subdifferential of "x" at x : 0 is the interval [91, 1]. The subdifferential
of "x" at x 9 0 is the singleton +1,, and the subdifferential at x : 0 is +91,.
 
Example 2.1. The subdifferential will be the empty set if no subgradient exists.
This may happen, as the following example shows. Let C : [91, 1] and let
f (x) : 9(1 9 x. (Sketch the graph.) The graph of f has a nonvertical
tangent line at any point x in (91, 1). From the graph it appears that for such

x the graph lies above the tangent line, and so *f (x ) : + f (x ),. Although
  
this can be veri¬ed in this example by relatively straightforward calculations,
we shall not do so. The validity of the statement will follow from general
theorems that we will prove. At the points x : <1, the graph has a vertical
tangent line, but the tangent line does not lie below the graph. Thus it appears
that f does not have a subgradient at these points. We now show this
analytically.
A number is a subgradient of f at x : 91 if and only if

9(1 9 x . (x ; 1) for all x in [91, 1].

For x in (91, 1] this inequality is equivalent to

1 9 x 
9 . x in (91, 1].
1;x
104 CONVEX FUNCTIONS


If we let x in (91, 1] tend to 91, the left-hand side of this inequality tends to
9-. Hence there is no + R such that the subgradient inequality holds for all
x in [91, 1].
A similar argument shows that *f (1) : `.

T 2.1. L et f be a convex function de¬ned on a convex set C. T hen at
each point x in int(C) there exists a vector in RL such that

f (x) . f (x ) ; 1 , x 9 x 2 (1)
 
for all x in C. In other words, *f (x ) " ` at every interior point x of C.
 
Example 2.1 shows that the requirement x + int(C) is essential.

Proof. We shall show that the conclusion of the theorem is a consequence
of the existence of a supporting hyperplane to epi( f ) at (x , f (x )).
 
The point (x , f (x )) is a boundary point of epi( f ). Since epi( f ) is convex,
 
we may apply Theorem II.4.1 with C : epi( f ) and get that there exists a vector
(a, ) " (0, 0) in RL> such that

1(a, ), (x, y)2 - 1(a, ), (x , f (x ))2
 
for all (x, y) in epi( f ). This relation is equivalent to

1a, x2 ; y - 1a, x 2 ; f (x ) (2)
 
for all (x, y) in epi( f ), that is, for all x in C and y . f (x).
Since the right-hand side of (2) is a constant, if were strictly positive, we
could obtain a contradiction by letting y become arbitrarily large. Hence - 0.
If : 0, then a " 0 and (2) becomes

1a, x2 - 1a, x 2 all x + C. (3)

Since x + int(C), there exists an 9 0 such that x : x ; a belongs to C.
  
Setting x : x in (3) gives 1a, a2 - 0, which is a contradiction. Thus, : 0.

In (2) we now divide through by : 0 and set y : f (x) to get


 
a
f (x) . f (x ) ; ,x 9x
 

for all x + C. Now set : 9a/ to get (1).

C¬¬ 1. If f is strictly convex, then strict inequality holds in (1).

We leave the proof as an exercise.
SUBGRADIENTS 105


The converse of Theorem 2.1 is false. To see this, let C be the square in R,
+x : 0 - x - 1, 0 - x - 1,. Let f (x) : 0 for x such that 0 - x - 1 and
  
0 : x - 1. Let f (x) :  9 "x 9 " for x such that 0 - x - 1 and x : 0. The
 
  
function f has a unique subgradient : 0 at every point in int(C), yet f is not
convex on C. The function f, however, is convex on int(C), and this phenom-
enon illustrates the general state of affairs.

T 2.2. L et C be a convex set in RL. If at each point x in C f has a

subgradient, then f is convex on C.

Remark 2.1. If we merely assume that at each point in int(C) the subdifferential
is not empty, then we can only conclude that f is convex on int(C).

Proof. Let x and y be two points in C and let z : x ; y, 9 0, 9 0,
; : 1. Since f has a subgradient at z, say ,

f (x) . f (z) ; 1 , x 9 z2 : f (z) ; 1 , x 9 y2,
f (y) . f (z) ; 1 , y 9 z2 : f (z) 9 1 , x 9 y2.

If we now multiply the ¬rst inequality by 9 0, the second inequality by 9 0,
and add the results, we get

f (x) ; f (y) . f (z) : f ( x ; y).

Hence f is convex.

Remark 2.2. The proof of the theorem shows that if at each x in C there exists

a subgradient such that f (x) 9 f (x ) ; 1 , x 9 x 2, for x " x , x in C, then
  
f is strictly convex on C.

Exercise 2.1. Show that if *f (x ) is not empty, then *f (x ) is closed and convex.
 
Exercise 2.2. Prove Corollary 1 to Theorem 2.1.

Exercise 2.3. Let f be a convex function de¬ned on a convex set C and let
x + int(C). Show that + *f (x ) if and only if f (x , v) . 1 , v2 for all v in RL.
  
Exercise 2.4. Let f be convex and concave on RL. Show that f is af¬ne, that
is, f (x) : 1a, x2 ; b.

Exercise 2.5. Let f (x):#x#. Show that *f (0):+ : # #-1,. F (From Theorem
3.1 we will get that, for x " 0, *f (x) : +x/#x#,).
106 CONVEX FUNCTIONS


Exercise 2.6. Let f (x , x ) : "x ; x ". Show that f is convex and calculate
  
*f ((0, 0)).


3. DIFFERENTIABLE CONVEX FUNCTIONS

We ¬rst show that if f is a convex function and differentiable [in the sense of
relation (5) in Section 11 of Chapter I] at an interior point of its domain, then
there is a unique subgradient at the point and the subgradient is the gradient
vector, and conversely, if f has a unique subgradient at x, then f is differenti-
able at x.

T 3.1. L et f be a convex function de¬ned on a convex set C in RL, and let
x be an interior point of C. T hen f is differentiable at x if and only if f has a
unique subgradient at x. Moreover, the unique element in the subgradient is

f (x).

Proof. Let f be differentiable at x. Since f is convex and x + int(C), by
Theorem 2.1 the subgradient *f (x) is not empty. Let + *f (x). Then for any h
in RL and suf¬ciently small t 9 0

f (x ; th) . f (x) ; 1 , th2.

Since f is differentiable at x, we have that for h " 0

f (x ; th) : f (x) ; 1
f (x), th2 ; (th),

where (th)/t#h# ; 0 as t ; 0. If we subtract the second relation from the ¬rst,
we get

0 . t[1 , h2 9 1
f (x, h2 9 (th)#h#/t#h#].

If we now divide by t 9 0 and then let t ; 0, we get that 1 9
f (x), h2 - 0
for all h " 0 in RL. In particular, if 9
f (x) " 0, then this inequality holds for
h : 9
f (x), whence # 9
f (x)# - 0. Thus, :
f (x).
We now suppose that the subgradient *f (x) consists of a unique element .
We will show that all the ¬rst-order partial derivatives of f exist at x. The
differentiability of f at x will then follow from Theorem 1.4.
To simplify the notation, we assume tha x : 0 and that f (x) : 0. There is
no loss of generality in making this assumption, since this amounts to a
translation of the origin in RL ; R to (x, f (x)). Let us now suppose that at least
one partial derivative of f at 0 does not exist. For de¬niteness let us assume
that it is the partial with respect to x . Let

g(t) : f (0 ; te ),

DIFFERENTIABLE CONVEX FUNCTIONS 107


where e is the ¬rst standard basis vector (1, 0, . . . , 0). Since f is convex and

0 + int(C), the function g is de¬ned and convex on some maximal interval I
containing 0 in its interior. Moreover, *f /*x does not exist at 0 if and only if

g(0) does not exist. Since g is convex, it follows from Corollary 2 to Theorem
1.2 that g (0) and g (0) exist and that g (0) - g (0). Therefore, if g(0) does
\ > \ >
not exist, then g (0) : g (0).
\ >
Let be any number satisfying

g (0) : : g (0). (1)

<<

. 23
( 59 .)



>>