$$
H_D \;=\;
\begin{bmatrix}
H_{11} & H_{12} & H_{13} & 0 & 0 & 0 \\
H_{21} & H_{22} & H_{23} & 0 & 0 & 0 \\
H_{31} & H_{32} & H_{33} & 0 & 0 & 0 \\
0 & 0 & 0 & H_{44} & 0 & 0 \\
0 & 0 & 0 & 0 & H_{55} & 0 \\
0 & 0 & 0 & 0 & 0 & H_{66}
\end{bmatrix}
\qquad (15.74)
$$

$$
H_Q \;=\;
\begin{bmatrix}
0 & 0 & 0 & H_{14} & H_{15} & H_{16} \\
0 & 0 & 0 & H_{24} & H_{25} & H_{26} \\
0 & 0 & 0 & H_{34} & H_{35} & H_{36} \\
H_{41} & H_{42} & H_{43} & 0 & H_{45} & H_{46} \\
H_{51} & H_{52} & H_{53} & H_{54} & 0 & H_{56} \\
H_{61} & H_{62} & H_{63} & H_{64} & H_{65} & 0
\end{bmatrix}
\qquad (15.75)
$$

The Delves-Freeman iteration is then

$$H_D\, a^{n+1} = f - H_Q\, a^n \qquad (15.76)$$

or equivalently

$$r^n = f - H\, a^n\,; \qquad a^{n+1} = a^n + H_D^{-1}\, r^n \qquad (15.77)$$

which is identical in form to the preconditioned Richardson's iteration. (Only the matrices are different.)

One crucial observation is that because $r^n$ is the residual of the differential equation, we do not need to explicitly multiply $a^n$ by the full Galerkin matrix $H$ to compute it. Instead, just as when using the grid point values as the unknowns, the Fast Fourier Transform can be applied to evaluate $r^n$ indirectly. The cost of computing the residual is the same as for the finite difference preconditioning.
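This indirect residual evaluation can be sketched for a one-dimensional Fourier problem. The operator $u_{xx} + q(x) u$, the grid, and the trial coefficients below are illustrative assumptions, not taken from the text:

```python
import numpy as np

# Sketch: evaluate the residual of u_xx + q(x) u = f(x) for a Fourier series
# without ever forming the dense Galerkin matrix H: differentiate in spectral
# space, multiply by q(x) on the grid, and transform back.  All functions here
# are illustrative choices.
N = 32
x = 2 * np.pi * np.arange(N) / N
k = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers
q = 1.0 + 0.5 * np.cos(x)
f = 0.25 * np.sin(2 * x)                  # forcing chosen so u = sin(x) is exact

a = np.fft.fft(np.sin(x))                 # current Fourier coefficients a^n
u = np.fft.ifft(a)                        # grid values of u
u_xx = np.fft.ifft(-(k ** 2) * a)         # second derivative via spectral space
Hu = np.fft.fft(u_xx + q * u)             # coefficients of u_xx + q u
r = np.fft.fft(f) - Hu                    # residual r^n = f - H a^n, O(N log N)
```

Because $u = \sin(x)$ solves this particular equation exactly, the residual is zero to rounding error; the whole computation is three FFT pairs rather than a dense matrix-vector multiply.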

To invert $H_D(M)$, write $H_D(M)$ in partitioned block form as

$$H_D = \begin{bmatrix} \Omega & 0 \\ 0 & \Upsilon \end{bmatrix} \qquad (15.78)$$

where $\Omega$ is an $M \times M$ matrix, $\Upsilon$ is a diagonal matrix whose elements are $\Upsilon_{ij} = H_{jj}\,\delta_{ij}$, and the 0's in (15.78) represent blocks of zeros. Then

$$H_D^{-1} = \begin{bmatrix} \Omega^{-1} & 0 \\ 0 & \Upsilon^{-1} \end{bmatrix} \qquad (15.79)$$


Figure 15.6: Norm of the corrections to the coefficients for the Delves-Freeman iteration for $M = 1, 2, \ldots, 8$, where $M$ is the size of the upper left block of $H_D$. The top curve (iteration diverges) is $M = 1$ ($H_D$ is diagonal). The rate of convergence increases monotonically with $M$. The ODE is $u_{xxxx} + [1 + 10 \cosh(4\cos(x) + 4)]\, u = \exp(-\cos(x))$ [same example as Tables 15.4 & 15.5]. Fourier cosine basis with $N = 32$.

where $\Upsilon^{-1}$ is the diagonal matrix whose elements are simply the reciprocals of the diagonal elements of $H$. Thus, the work in (15.79) is the cost of factoring an $M \times M$ dense matrix, which is $O([2/3]M^3)$, plus $(N - M)$ divisions.

Thus, if $M \ll N$, the cost per iteration is dominated by the $O(N \log_2 N)$ cost of computing the residual $r^n$ via the Fast Fourier Transform, the same as for finite difference preconditioning.
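The partitioned inverse (15.79) can be applied without ever forming $H_D^{-1}$ explicitly. A minimal sketch, in which the matrix below is a made-up stand-in for a Galerkin matrix:

```python
import numpy as np

def apply_HD_inverse(H, M, r):
    """Apply H_D(M)^{-1} to a vector r via the partitioned form (15.79):
    one dense M x M solve for the Omega block plus (N - M) divisions."""
    out = np.empty_like(r, dtype=float)
    out[:M] = np.linalg.solve(H[:M, :M], r[:M])   # Omega^{-1}: O(M^3) factorization
    out[M:] = r[M:] / np.diag(H)[M:]              # Upsilon^{-1}: reciprocal diagonal
    return out

# Stand-in matrix with a strong diagonal, for illustration only
N, M = 8, 3
i = np.arange(N)
H = np.diag(1.0 + i ** 2) + 0.1 / (1.0 + np.abs(np.subtract.outer(i, i)))
r = np.ones(N)
z = apply_HD_inverse(H, M, r)
```

In a production code one would factor the $M \times M$ block once (LU) and reuse the factorization every iteration, since $H_D$ does not change.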

Fig. 15.6 shows that when $M = 1$ (topmost graph), the iteration diverges. This is strong proof that the Galerkin matrix is not "diagonally dominant"! For $M \geq 2$, however, the iteration converges geometrically just like all the other variants of Richardson's method. The rate of convergence increases rapidly with $M$, especially when $M$ is small, so it is a good idea to use a moderate value for $M$ rather than the smallest $M$ which gives convergence.
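A toy version of the whole iteration (15.77) can be sketched as follows; the matrix $H$ is a made-up stand-in whose off-diagonal entries decay away from the diagonal, mimicking an "asymptotically diagonal" Galerkin matrix:

```python
import numpy as np

# Toy Delves-Freeman iteration: a^{n+1} = a^n + H_D^{-1} (f - H a^n).
# H below is an illustrative stand-in, not a real Galerkin matrix.
N, M = 12, 3
i = np.arange(N)
H = 0.2 / (1.0 + np.abs(np.subtract.outer(i, i)))
H[i, i] = 1.0 + i                        # strong diagonal
f = np.ones(N)

diag = np.diag(H).copy()
a = np.zeros(N)
for _ in range(100):
    r = f - H @ a                        # residual (FFT-evaluable in practice)
    da = np.empty(N)
    da[:M] = np.linalg.solve(H[:M, :M], r[:M])   # Omega block solve
    da[M:] = r[M:] / diag[M:]                    # diagonal part
    a = a + da                           # preconditioned Richardson update
```

For this stand-in matrix the iteration contracts geometrically and `a` converges to the solution of $H a = f$.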

The concept generalizes to partial differential equations and other basis sets, too. For example, $\cos(kx)\cos(my)$ is an eigenfunction of the Laplace operator, so the Fourier-Galerkin representation of

$$\nabla^2 u + q(x, y)\, u = f(x, y) \qquad (15.80)$$

is "asymptotically diagonal". The only complication is that one should use the "speedometer" ordering of unknowns so that the columns and rows are numbered such that the smallest $i, j$ correspond to the smallest values of $i^2 + j^2$.
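In code, the "speedometer" ordering is simply a sort of the mode pairs by $i^2 + j^2$; the $4 \times 4$ mode range below is a hypothetical illustration:

```python
# "Speedometer" ordering: number the 2-D modes (i, j) so that the smallest
# values of i^2 + j^2 come first.  The 4 x 4 mode range is illustrative.
modes = [(i, j) for i in range(4) for j in range(4)]
modes.sort(key=lambda ij: ij[0] ** 2 + ij[1] ** 2)
# modes now begins (0,0), (0,1), (1,0), (1,1), ...
```

With this ordering, the low-degree modes that need the dense $\Omega$ block occupy the first few rows and columns of the Galerkin matrix.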

The Delves-Freeman idea works well with certain non-Fourier basis sets. For example, Hermite functions $\psi_n(y)$ satisfy the eigenrelation

$$\frac{d^2}{dy^2}\, \psi_n(y) = \left[\, y^2 - (2n+1)\, \right] \psi_n(y) \qquad (15.81)$$

The second derivative operator is not diagonal, but its $n$-dependent part contributes only to the diagonal. This implies that the Hermite-Galerkin matrix for

$$u_{yy} + q(y)\, u = f(y) \qquad (15.82)$$

is asymptotically diagonal.
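The eigenrelation (15.81) is easy to verify symbolically for the unnormalized Hermite functions $H_n(y)\, e^{-y^2/2}$; a quick check, assuming SymPy is available:

```python
import sympy as sp

# Verify (15.81): psi_n'' = [y^2 - (2n+1)] psi_n for the (unnormalized)
# Hermite function psi_n(y) = H_n(y) exp(-y^2/2).  Normalization does not
# matter for a linear eigenrelation.
y = sp.symbols('y')
for n in range(5):
    psi = sp.hermite(n, y) * sp.exp(-y ** 2 / 2)
    residual = sp.diff(psi, y, 2) - (y ** 2 - (2 * n + 1)) * psi
    assert sp.simplify(residual) == 0
```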

Unfortunately, there are complications for the most popular application, Chebyshev polynomials, because the Chebyshev-Galerkin representation of derivatives is not diagonal. With sufficient ingenuity, it is still possible to apply the same general idea or an extension. For example, in the next section, we show that the Chebyshev solution of

$$u_{xx} + q\, u = f(x) \qquad (15.83)$$

may be converted into the solution of a bordered tridiagonal matrix (for constant $q$). When $q$ varies with $x$, the Galerkin matrix is dense, but "asymptotically tridiagonal". One can generalize $H_D$ by replacing the diagonal matrix by a tridiagonal matrix; this block-plus-tridiagonal matrix is still cheap to invert (Appendix B).

In multiple dimensions with a Chebyshev basis, the natural extension of the Delves-Freeman method is to iterate using the Galerkin representation of a separable problem. As explained below in Sec. 15.11, the Galerkin representation of a separable elliptic problem has, with a little ingenuity, a simple form that allows very cheap iteration. The resulting iteration for nonseparable PDEs is preconditioned in spectral space. However, unlike Delves and Freeman's simple conception, the low degree basis functions are not treated differently from the rest of the basis.

In multiple dimensions with a Fourier or spherical harmonics basis, a simple block-

plus-diagonal iteration of Delves and Freeman type may be very effective if the unknowns

are ordered so that the low degree basis functions are clustered in the first few rows and

columns.

The Delves-Freeman algorithm can also be extended to nonlinear problems through the Nonlinear Richardson's Iteration strategy which is described in Sec. 15.14. Boyd (1997d) is a successful illustration with a Fourier basis in one space dimension.

The moral of the story is that one can precondition either on the pseudospectral grid or in spectral space. The residual may be evaluated by the Fast Fourier Transform equally cheaply in either case. The decision between Orszag's finite difference iteration and the Delves-Freeman Galerkin iteration should be based on which of these choices leads to a preconditioning matrix, $H_D$, which is more easily inverted for a given problem.

15.10 Recursions & Formal Integration: Constant Coefficient ODEs

Clenshaw (1957) observed that for simple constant coefficient differential equations, Chebyshev non-interpolating methods generate sparse matrices. It is simplest to illustrate the idea by example. Consider

$$u_{xx} - q\, u = f(x) \qquad (15.84)$$


where $q$ is a constant. The pseudospectral discretization of Eq. (15.84) is dense. In contrast, Galerkin's method (using Chebyshev polynomials as "test" functions, not boundary-vanishing basis functions) gives a nearly triangular matrix. The reason is that

$$T_{k,xx} = \sum_{m=0}^{k-2} \frac{1}{c_m}\, k\, (k^2 - m^2)\, T_m(x) \qquad (15.85)$$

where $c_m = 2$ if $m = 0$ and $c_m = 1$ otherwise, and where the sum is restricted to polynomials of the same parity as $T_k(x)$. Consequently, $(T_m, T_{k,xx}) = 0$ for all $m > (k - 2)$. Unfortunately, the cost of backsolving is still $O(N^2)$. The problem is that the derivative of a Chebyshev polynomial involves all Chebyshev polynomials of the same parity and lower degree.
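The recurrence (15.85) can be checked against NumPy's Chebyshev derivative routine:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def second_derivative_coeffs(k):
    """Chebyshev coefficients of T_k'' from (15.85):
    sum over m = k-2, k-4, ... of (1/c_m) k (k^2 - m^2) T_m."""
    out = np.zeros(max(k - 1, 1))
    for m in range(k - 2, -1, -2):       # same parity as T_k only
        c_m = 2.0 if m == 0 else 1.0
        out[m] = k * (k ** 2 - m ** 2) / c_m
    return out

# Compare with numpy's chebder for T_5 (coefficient vector e_5)
e5 = np.zeros(6)
e5[5] = 1.0
assert np.allclose(C.chebder(e5, 2), second_derivative_coeffs(5))
```

Note that every entry of the same parity below $k - 2$ is nonzero, which is exactly why the Galerkin matrix is only *nearly* triangular and backsolving costs $O(N^2)$.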

Clenshaw observed that the formula for the integral of a Chebyshev polynomial involves just two polynomials, while the double integral involves only three:

$$\int T_n(x)\, dx = \frac{1}{2} \left[ \frac{T_{n+1}(x)}{n+1} - \frac{T_{n-1}(x)}{n-1} \right], \qquad n \geq 2 \qquad (15.86)$$

This formula is actually equivalent to (15.39), the recurrence for computing the coefficients of $df/dx$ from those of $f(x)$. Clenshaw himself exploited (15.39) to manipulate the Galerkin equations into tridiagonal form, but there are two other ways of obtaining the same result.
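Formula (15.86) can likewise be checked with NumPy's Chebyshev integration routine:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Check (15.86): the antiderivative of T_n has coefficient 1/(2(n+1)) on
# T_{n+1} and -1/(2(n-1)) on T_{n-1}, for n >= 2.  (chebint also adjusts
# the constant term, which is absorbed into the integration constant.)
n = 4
en = np.zeros(n + 1)
en[n] = 1.0
ci = C.chebint(en)                    # antiderivative coefficients
assert np.isclose(ci[n + 1], 1.0 / (2 * (n + 1)))
assert np.isclose(ci[n - 1], -1.0 / (2 * (n - 1)))
```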

One is to apply the Mean-Weighted Residual method using the second derivatives of the Chebyshev polynomials as the "test" functions. These derivatives are the Gegenbauer polynomials of second order (as explained in Appendix A), so they are orthogonal, implying that the contributions of the second derivative to the Galerkin matrix appear only on the diagonal, and they are complete, implying that these Gegenbauer polynomials are legitimate "test" functions. Via recurrence relations between Gegenbauer polynomials of different orders, $T_m(x)$ may be written as a linear combination of the three Gegenbauer polynomials of degrees $(m - 2, m, m + 2)$, so that the Galerkin matrix has only three nonzero elements in each row.

The third justification, completely equivalent to the first two, is to formally integrate the equation twice to obtain

$$u - q \int\!\!\!\int u = \int\!\!\!\int f(t) + A + B\,x \qquad (15.87)$$

where $A$ and $B$ are arbitrary constants determined by the boundary conditions. If we then demand that (15.87) should be orthogonal to $T_2(x), \ldots, T_n(x)$, we obtain

$$
a_n - q \left[ \frac{c_{n-2}}{4n(n-1)}\, a_{n-2} - \frac{\sigma_n}{2(n^2-1)}\, a_n + \frac{\sigma_{n+2}}{4n(n+1)}\, a_{n+2} \right]
= \frac{c_{n-2}}{4n(n-1)}\, f_{n-2} - \frac{\sigma_n}{2(n^2-1)}\, f_n + \frac{\sigma_{n+2}}{4n(n+1)}\, f_{n+2}
\qquad (15.88)
$$

where $\sigma_n = 1$ for $n < N - 1$ and $\sigma_n = 0$ for larger $n$, and where the $f_n$ are the Chebyshev coefficients of the forcing function, $f(x)$.

The resulting matrix contains two full rows to impose the boundary conditions, so it is only quasi-tridiagonal. However, the methods for "bordered" matrices (Appendix B) compute the coefficients $a_n$ in "roughly the number of operations required to solve pentadiagonal systems of equations", to quote Gottlieb & Orszag (1977, pg. 120). Canuto et al. (1988) give a good discussion of generalizations to other boundary conditions and so on.

Clenshaw and others applied his recursive techniques to many different equations. If the coefficients are polynomials, the result will again be a banded matrix. If $q$ includes a term in $x^2$, the matrix is quasi-pentadiagonal. If $q$ has a term in $x^4$, the matrix is quasi-heptadiagonal, and so on.

The most complete archive of early work is Chapter 5 of Fox & Parker (1968). The authors point out that Clenshaw's strategy not only creates sparsity, but also may improve accuracy. The reason is that the Chebyshev series of a derivative always converges more slowly than that of $u(x)$ itself. After integration, accuracy is no longer limited by that of the slowly convergent series for the second derivative, but only by that of $u(x)$ itself.

Of course, as stressed many times above, factors of $N^2$ are irrelevant for exponentially convergent approximations when $N$ is large. The extra accuracy and sparse matrices produced by the double integration are most valuable for paper-and-pencil calculations, or when $N$ is small. In consequence, these integrate-the-ODE methods have gone rather out of fashion.

Zebib (1984) is a return to this idea: he assumes a Chebyshev series for the highest derivative (rather than $u(x)$ itself) and obtains formulas for the contributions of lower derivatives by applying (15.85). Although this procedure is complicated, especially for fourth order equations, it both improves accuracy and eliminates very large, unphysical complex eigenvalues from the Chebyshev discretization of the Orr-Sommerfeld equation (Zebib, 1987b).

Coutsias, Hagstrom and Torres (1996) have extended Clenshaw's method to all the standard orthogonal polynomials including Laguerre and Hermite polynomials. They also provide useful identities, recursions, and estimates of condition number. Clenshaw's method gives banded matrices when the coefficients of the ODE are rational functions as well as when the coefficients are polynomials. Coutsias et al. show that by changing the coordinate using a rational function, it is possible to resolve narrow boundary layers without sacrificing a banded matrix representation, as illustrated with numerical examples.

Thus, the integrate-the-ODE methods have a small but useful role in spectral methods

even today.

15.11 Direct Methods for Separable PDE's

The earliest work on spectral methods for separable PDE's is Haidvogel and Zang (1979). They offer two algorithms for exploiting the separability, one iterative and one direct. We shall discuss only the latter.

The key to the Haidvogel-Zang algorithm is a procedure which can also be applied to one-dimensional problems, so we will begin with the ODE case. When

$$[\, p(x)\, u_x\, ]_x - q\, u = f(x) \qquad (15.89)$$

is discretized, it generates the matrix problem

$$\left( \Lambda_x - q\, I \right) a = f \qquad (15.90)$$

where $I$ is the identity matrix. We could absorb the constant $q$ into $\Lambda_x$, of course, but have split it off for future convenience.

If the collocation matrix $\Lambda_x$ is diagonalizable, we can write

$$\Lambda_x = E_x\, \tilde{\Lambda}_x\, E_x^{-1} \qquad (15.91)$$

where $E_x$ is the matrix whose columns are the $n$ eigenvectors of $\Lambda_x$ and where $\tilde{\Lambda}_x$ is the diagonal matrix whose elements are the corresponding eigenvalues. Eq. (15.90) becomes

$$E_x \left( \tilde{\Lambda}_x - q\, I \right) E_x^{-1}\, a = f \qquad (15.92)$$

Multiplying through by $E_x^{-1}$ gives

$$\left( \tilde{\Lambda}_x - q\, I \right) E_x^{-1}\, a = E_x^{-1}\, f \qquad (15.93)$$

Define

$$b \equiv E_x^{-1}\, a\,; \qquad g \equiv E_x^{-1}\, f \qquad (15.94)$$
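The chain of steps (15.90)-(15.94) amounts to the following sketch, in which a toy symmetric matrix stands in for the collocation matrix $\Lambda_x$ (the real matrix comes from the discretization):

```python
import numpy as np

# Sketch of (15.90)-(15.94): solve (Lambda_x - q I) a = f by diagonalizing
# Lambda_x.  A toy symmetric matrix stands in for the collocation matrix,
# so the eigendecomposition is real and orthogonal.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
Lx = (A + A.T) / 2                     # symmetric stand-in for Lambda_x
q = -10.0                              # chosen well outside the spectrum of Lx
f = rng.standard_normal(6)

lam, Ex = np.linalg.eigh(Lx)           # Lambda_x = Ex diag(lam) Ex^{-1}
g = Ex.T @ f                           # g = Ex^{-1} f  (Ex orthogonal here)
b = g / (lam - q)                      # diagonal system: (lam_j - q) b_j = g_j
a = Ex @ b                             # recover a from b = Ex^{-1} a
```

Once the eigendecomposition of $\Lambda_x$ has been computed and stored, each new right-hand side $f$ (or each new constant $q$) costs only two matrix-vector multiplies and $n$ divisions.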