r^n ≡ f − Λ u^n    (15.52)

where Λ is the discretization of the differential operator L and where f is the array of grid point values of f(x) in the boundary value problem

L u = f    (15.53)

15.8. MRR METHOD 305

The matrices of the MRR are related to Λ and f through the preconditioning matrix H via

A ≡ H^{-1} Λ    (15.54a)

g ≡ H^{-1} f    (15.54b)

The goal is to choose τ_n so as to minimize the norm of the residual, i.e., create the smallest possible value of

(r^{n+1}, r^{n+1})    (15.55)

To compute the optimum τ, it is helpful to introduce the auxiliary vector

z^n ≡ H^{-1} f − H^{-1} Λ u^n = H^{-1} r^n    (15.56)

It is easy to show (try it!) that

r^{n+1} = r^n − τ Λ z^n    (15.57)

The solution to this minimization problem turns out to be the same as that of a series expansion, so we briefly digress for the following:

Theorem 29 (Mean-Square Minimization with a Truncated Series) Suppose the goal is to approximate a function f(x) with a truncated series so as to minimize

(R_N, R_N)    (15.58)

where the residual (error) is

R_N(x) ≡ f(x) − Σ_{n=0}^{N} a_n φ_n    (15.59)

The choice that minimizes the mean-square error is

a_n = (f, φ_n)/(φ_n, φ_n)    (15.60)

INDEPENDENT of N so long as the basis functions are orthogonal.

Proof: Substitute the definition of R_N into (15.58) and then differentiate with respect to each of the (N + 1) coefficients in turn and set the result equal to 0. (Recall that the condition for a function of (N + 1) variables to be a minimum is that ∂(R_N, R_N)/∂a_0 = ∂(R_N, R_N)/∂a_1 = ... = ∂(R_N, R_N)/∂a_N = 0.) After the differentiations have been performed, (15.60) is obvious. QED
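Theorem 29 is easy to check numerically. The sketch below uses assumptions not in the text: a cosine basis sampled on a midpoint grid of [0, π], which makes the basis exactly orthogonal in the discrete inner product, and an arbitrary smooth test function. It fits two different truncations by least squares and confirms that the shared coefficients agree, and that both match the projection formula (15.60).

```python
import numpy as np

# Midpoint grid on [0, pi]: cos(n x) for n = 0..M-1 is then exactly orthogonal
# under the discrete (unweighted) inner product, as in the DCT-II.
M = 64
x = (np.arange(M) + 0.5) * np.pi / M
f = np.exp(-np.cos(x))                        # arbitrary smooth, symmetric test function

def lstsq_coeffs(N):
    """Least-squares fit of f by sum_{n=0}^{N} a_n cos(n x)."""
    Phi = np.cos(np.outer(x, np.arange(N + 1)))   # columns are the basis functions
    a, *_ = np.linalg.lstsq(Phi, f, rcond=None)
    return a

a5, a15 = lstsq_coeffs(5), lstsq_coeffs(15)
# Orthogonality makes the leading coefficients independent of the truncation N:
print(np.allclose(a5, a15[:6]))

# Each coefficient also matches the projection formula (15.60):
n = np.arange(16)
norms = np.where(n == 0, M, M / 2)            # discrete (phi_n, phi_n) on this grid
proj = (np.cos(np.outer(n, x)) @ f) / norms
print(np.allclose(a15, proj))
```

Repeating the experiment with a non-orthogonal basis (e.g. monomials) destroys both agreements, which is the content of the theorem.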

Eq. (15.60) is identical with the formula for the coefficients of an infinite series of orthogonal functions, (3.24). Although we are not performing a spectral expansion but rather solving a matrix algebra problem via the MRR, comparison of (15.57) with (15.59) shows that the minimization problem of choosing the "time step" τ_n is identical in form to that of computing a one-term, mean-square approximation to a function. Here, r^n plays the role of f(x) and Λ z^n is the "basis function", impersonating φ_0(x). The choice of τ that minimizes the residual is therefore

τ_n = (r^n, Λ z^n)/(Λ z^n, Λ z^n)    (15.61)

CHAPTER 15. MATRIX-SOLVING METHODS

306

Definition 34 (Minimum Residual Richardson's (MRR) Algorithm)

To solve the matrix problem

Λ u = f,    (15.62)

using the preconditioning matrix H, given a first guess u^0:

Initialization:

r^0 ≡ f − Λ u^0    (15.63a)

z^0 ≡ H^{-1} r^0    (15.63b)

and then repeat the following steps until convergence:

Iteration:

τ_n = (r^n, Λ z^n)/(Λ z^n, Λ z^n)    (15.64a)

u^{n+1} = u^n + τ_n z^n    (15.64b)

r^{n+1} = r^n − τ_n Λ z^n    (15.64c)

z^{n+1} = H^{-1} r^{n+1}    (15.64d)

The iteration converges as long as the eigenvalues of H^{-1} Λ lie strictly in the right half of the complex plane.
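In dense linear algebra, the four steps of (15.63)-(15.64) can be sketched as follows. The test problem at the bottom is a hypothetical stand-in (a second-difference matrix perturbed by a small diagonal), not the problems of Fig. 15.4; its only purpose is to give a preconditioned operator H^{-1} Λ with eigenvalues that are real, positive, and near unity.

```python
import numpy as np

def mrr_solve(Lam, H, f, u0, tol=1e-10, maxit=500):
    """Minimum Residual Richardson's iteration for Lam @ u = f, preconditioned by H."""
    u = u0.copy()
    r = f - Lam @ u                      # (15.63a): initial residual
    z = np.linalg.solve(H, r)            # (15.63b): z = H^{-1} r
    for _ in range(maxit):
        Lz = Lam @ z
        tau = (r @ Lz) / (Lz @ Lz)       # (15.64a): the tau minimizing |r_{n+1}|
        u = u + tau * z                  # (15.64b)
        r = r - tau * Lz                 # (15.64c)
        if np.linalg.norm(r) < tol * np.linalg.norm(f):
            break
        z = np.linalg.solve(H, r)        # (15.64d)
    return u

# Hypothetical test: H is a 1-D second-difference matrix and Lam = H plus a
# small diagonal perturbation, so spec(H^{-1} Lam) lies in the right half-plane.
m = 8
H = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
Lam = H + 0.1 * np.diag(np.linspace(0.0, 1.0, m))
f = np.ones(m)
u = mrr_solve(Lam, H, f, np.zeros(m))
print(np.linalg.norm(Lam @ u - f))       # residual norm after convergence
```

Note that each pass costs one multiply by Λ, one solve with H, and the two inner products of (15.64a); only the inner products are extra relative to unmodified Richardson's iteration.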

Because the only additional quantity needed at each step is the scalar τ_n, which is computed by two inexpensive inner products, the cost per iteration of the MRR is usually no more than a few percent greater than that of the unmodified Richardson's method.

Fig. 15.4, from Canuto and Quarteroni (1985), compares the accuracy of the unmodified Richardson's method with the MRR method. The graphs are clear testimony to the power of minimizing the residual: four iterations of Richardson's method reduce the error by no more than a single MRR iteration.

Both cases are difficult in the sense that the first derivative terms are very large, so Orszag's estimates for the condition number are too optimistic. (Recall that, following Orszag, we neglected all derivatives except the highest in estimating the condition number.) Instead, Canuto and Quarteroni computed the exact condition numbers and used them to optimize Richardson's method. In real life, where the eigenvalues are not known, one would try Orszag's optimal timestep and the Richardson iteration would blow up. Trial-and-error would eventually give a stable τ, but probably not one which is optimal, so the graphs in Fig. 15.4 make the unmodified Richardson's algorithm look better than it would in practice.

In contrast, the MRR method is parameter-free, so the rapid convergence seen in Fig. 15.4 would always be achieved for these problems. This is particularly important for Case 2, which is a two-dimensional boundary value problem where the preconditioning matrix H is an incomplete factorization of the finite difference matrix, and therefore has a rather large condition number.

15.9. DELVES-FREEMAN BLOCK-AND-DIAGONAL ITERATION 307

Figure 15.4: Convergence of Richardson's iteration (dashed) versus the Minimum Residual Richardson's method (solid) for two problems. The y-axis is the logarithm of (r, r)/(f, f), where the residual r ≡ f − Λ u as in the text and ( , ) denotes the usual matrix inner product. Taken from Canuto & Quarteroni (1985). The differential equation is −[(1 + 10 x) u_x]_x = f(x). It was solved with 127 grid points on [−1, 1] and f(x) chosen so the exact solution is u(x) = sin(πx).

15.9 The Delves-Freeman "Asymptotically Diagonal" Preconditioned Iteration

The monograph by Delves and Freeman (1981) and the short review of Delves (1976) analyze an alternative preconditioning which is based on two key ideas. The first idea is to precondition using a low order approximation to the Galerkin matrix. (In contrast, the finite difference preconditioner is a low order grid point approximation.)

The second key idea is to exploit the matrix property of being "asymptotically diagonal". Galerkin approximations are usually not "diagonally dominant" in the sense that is so valuable in iterations for finite difference methods (Birkhoff and Lynch, 1984). However, in many cases, the diagonal element of the j-th row of the Galerkin matrix does become more and more important relative to the other elements in the same row and column as j → ∞. In this sense, the Galerkin matrix is "asymptotically diagonal".

An example will clarify this concept. Consider the Fourier cosine-Galerkin solution of

u_xxxx + q(x) u = f(x)    (15.65)

where u(x), q(x), and f(x) are all smooth, periodic functions symmetric about the origin. (The symmetry is assumed only for simplicity.) The Galerkin discretization is

H a = f    (15.66)

where a is the column vector containing the Fourier cosine coefficients, a_i, and

H_ij = (cos[ix], [∂_xxxx + q(x)] cos[jx])    (15.67)

f_i = (cos[ix], f)    (15.68)


Figure 15.5: Same as Fig. 15.4 except that the differential equation is

−[(1 + 10 x^2 y^2) u_x]_x − [(1 + 10 x^2 y^2) u_y]_y = f(x, y)

where f(x, y) is chosen so that the exact solution is u(x, y) = sin(πx) sin(πy).

where

(a, b) ≡ ∫_0^π a(x) b(x) dx    (15.69)

The crucial point is that if we define

q_ij ≡ (cos[ix], q(x) cos[jx])    (15.70)

then

H_ij = j^4 δ_ij + q_ij    (15.71)

where δ_ij is the usual Kronecker delta (= 1 if i = j and 0 otherwise). The fourth derivative contributes only to the diagonal elements and its contribution increases rapidly with the row or column number. In contrast, the matrix elements of q(x) are O(1) at most³. It follows that the rows and columns of the Galerkin matrix H will be increasingly dominated by the huge diagonal elements as i, j → ∞.
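As a concrete check, a short script can assemble H by quadrature and exhibit the growing diagonal. This is a sketch: the trapezoidal grid resolution is an assumption, and the cosines are normalized to unit norm under the inner product (15.69), which makes the fourth-derivative contribution exactly j^4 on the diagonal and appears to be the convention behind the tabulated values of Table 15.4.

```python
import numpy as np

# Assemble the Galerkin matrix (15.71) by trapezoidal quadrature of (15.69),
# using the particular q(x) quoted in Table 15.4.
Mq = 4000
x = np.linspace(0.0, np.pi, Mq + 1)
w = np.full(Mq + 1, np.pi / Mq)
w[[0, -1]] *= 0.5                                    # trapezoidal end weights

q = 1 + 10 * np.cosh(4 * np.cos(x) + 0.4)

N = 8
i = np.arange(N)
C = np.cos(np.outer(i, x))                           # cos(i x) sampled on the grid
norms = np.sqrt(np.where(i == 0, np.pi, np.pi / 2))  # ||cos(i x)|| under (15.69)
Phi = C / norms[:, None]                             # cosines normalized to unit norm
qij = (Phi * (w * q)) @ Phi.T                        # q_ij of (15.70) by quadrature
H = np.diag(i.astype(float) ** 4) + qij              # (15.71)

# The diagonal grows like j^4 while the q_ij stay O(1), so every row of this
# 8 x 8 block is already dominated by its diagonal entry:
row_dominance = np.abs(H).max(axis=1) == np.diag(H)
print(np.round(np.diag(H), 1))
```

With this normalization the computed diagonal reproduces the diagonal of Table 15.4 (123.2, 193.6, ..., 2524.2) to the quoted precision.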

Table 15.4 shows the Galerkin matrix for a particular (arbitrary) choice of q(x). To make the diagonal dominance clearer, it is helpful to define a rescaled matrix h via

h_ij = H_ij / √(H_ii H_jj)    (15.72)

that |qij | ¤ π max |q(x)| since the integral de¬ning these matrix elements is bounded by the maximum

3 Note

of its integrand, multiplied by the length of the integration interval. This estimate is conservative: The qij for any

smooth q(x) decrease rapidly as either i, j, or both together increase.


Table 15.4: The upper left 8 × 8 block of the Galerkin matrix H. This particular case is the Fourier cosine solution of u_xxxx + [1 + 10 cosh(4 cos(x) + 0.4)] u = exp(−cos(x)).

i\j       0       1       2       3       4       5       6       7
 0    123.2    56.7    98.2    19.4    21.7     2.9     2.4     0.2
 1     56.7   193.6    53.8    84.7    15.8    17.0     2.2     1.8
 2     98.2    53.8   154.5    42.2    71.1    13.9    15.4     2.1
 3     19.4    84.7    42.2   205.9    40.3    69.5    13.7    15.3
 4     21.7    15.8    71.1    40.3   379.3    40.1    69.4    13.7
 5      2.9    17.0    13.9    69.5    40.1   748.2    40.1    69.4
 6      2.4     2.2    15.4    13.7    69.4    40.1  1419.2    40.1
 7      0.2     1.8     2.1    15.3    13.7    69.4    40.1  2524.2

Table 15.5: The upper left 12 × 12 block of the rescaled Galerkin matrix h, multiplied by 1000.

i\j     0     1     2     3     4     5     6     7     8     9    10    11
 0   1000   367   712   122   100    10     6     0     0     0     0     0
 1    367  1000   311   424    58    45     4     3     0     0     0     0
 2    712   311  1000   236   294    41    33     3     2     0     0     0
 3    122   424   236  1000   144   177    25    21     2     1     0     0
 4    100    58   294   144  1000    75    95    14    12     1     1     0
 5     10    45    41   177    75  1000    39    51     8     7     1     1
 6      6     4    33    25    95    39  1000    21    28     4     4     0
 7      0     3     3    21    14    51    21  1000    12    17     3     3
 8      0     0     2     2    12     8    28    12  1000     8    11     2
 9      0     0     0     1     1     7     4    17     8  1000     5     7
10      0     0     0     0     1     1     4     3    11     5  1000     3
11      0     0     0     0     0     1     0     3     2     7     3  1000


which makes the diagonal elements of h equal to unity while preserving the symmetry of

the matrix. Table 15.5 shows h.
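The rescaling (15.72) is a single line of linear algebra. In the sketch below, A is an arbitrary symmetric stand-in matrix (a hypothetical example, not the Galerkin matrix of Table 15.4); the point is only that dividing by √(A_ii A_jj) produces a unit diagonal while preserving symmetry.

```python
import numpy as np

# Build an arbitrary symmetric matrix with a positive diagonal as a stand-in
# for the Galerkin matrix H of (15.72).
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = B @ B.T + 6 * np.eye(6)

# The rescaling (15.72): h_ij = A_ij / sqrt(A_ii A_jj).
d = np.sqrt(np.diag(A))
h = A / np.outer(d, d)

print(np.allclose(np.diag(h), 1.0))   # diagonal elements rescaled to unity
print(np.allclose(h, h.T))            # symmetry is preserved
```

Applying the same two lines to the matrix of Table 15.4 yields the entries of Table 15.5 (after multiplying by 1000 and rounding).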

Unfortunately, H is only "asymptotically diagonal" because when the row and column indices are small, q_ij is the same order of magnitude as the diagonal elements. Delves and Freeman (1981) showed that one can exploit "asymptotic diagonality" by partitioning the Galerkin matrix H into

H = H_D(M) + H_Q(M)    (15.73)

where H_D(M) is the upper left-hand M × M block of H plus all the diagonal elements and H_Q contains all the off-diagonal elements outside the upper left M × M block. When M = 3 and N = 6, for example,

H_11  H_12  H_13   0    0    0
H_21  H_22  H_23   0    0    0