
$$
H_D = \begin{bmatrix}
H_{11} & H_{12} & H_{13} & 0 & 0 & 0 \\
H_{21} & H_{22} & H_{23} & 0 & 0 & 0 \\
H_{31} & H_{32} & H_{33} & 0 & 0 & 0 \\
0 & 0 & 0 & H_{44} & 0 & 0 \\
0 & 0 & 0 & 0 & H_{55} & 0 \\
0 & 0 & 0 & 0 & 0 & H_{66}
\end{bmatrix}
\qquad (15.74)
$$

$$
H_Q = \begin{bmatrix}
0 & 0 & 0 & H_{14} & H_{15} & H_{16} \\
0 & 0 & 0 & H_{24} & H_{25} & H_{26} \\
0 & 0 & 0 & H_{34} & H_{35} & H_{36} \\
H_{41} & H_{42} & H_{43} & 0 & H_{45} & H_{46} \\
H_{51} & H_{52} & H_{53} & H_{54} & 0 & H_{56} \\
H_{61} & H_{62} & H_{63} & H_{64} & H_{65} & 0
\end{bmatrix}
\qquad (15.75)
$$
The Delves-Freeman iteration is then
$$H_D\, a^{n+1} = f - H_Q\, a^n \qquad (15.76)$$

or equivalently

$$r^n = f - H\, a^n\,; \qquad a^{n+1} = a^n + H_D^{-1}\, r^n \qquad (15.77)$$
which is identical in form to the preconditioned Richardson's iteration. (Only the matrices
are different.)
One crucial observation is that because $r^n$ is the residual of the differential equation, we
do not need to explicitly multiply $a^n$ by the full Galerkin matrix H to compute it. Instead,
just as when the grid point values are used as the unknowns, the Fast Fourier Transform
can be applied to evaluate $r^n$ indirectly. The cost of computing the residual is the same as
for the finite difference preconditioning.
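As a concrete illustration, here is a minimal sketch (Python with NumPy, not from the original text) of how the residual might be evaluated in coefficient space for the model problem $u_{xxxx} + q(x) u = f$ of Fig. 15.6. For simplicity it uses a full complex Fourier basis on $[0, 2\pi)$ rather than the cosine basis, and it accepts the pseudospectral (aliased) evaluation of the product $q\,u$ as the "indirect" FFT evaluation; the function name and normalization convention are assumptions for this sketch.

```python
import numpy as np

def residual(a, fhat):
    """r = fhat - H a for u_xxxx + q(x) u = f on [0, 2*pi), evaluated with FFTs.

    a, fhat : complex Fourier coefficients in numpy.fft ordering, with the
              convention u(x_j) = sum_k a_k exp(i k x_j).
    q(x) is the variable coefficient of the Fig. 15.6 example, hard-wired
    here purely for illustration."""
    N = len(a)
    k = np.fft.fftfreq(N) * N                  # integer wavenumbers 0, 1, ..., -1
    x = 2 * np.pi * np.arange(N) / N
    q = 1.0 + 10.0 * np.cosh(4.0 * np.cos(x) + 4.0)
    u = np.fft.ifft(a) * N                     # coefficients -> grid values
    qu_hat = np.fft.fft(q * u) / N             # grid values  -> coefficients
    return fhat - (k**4 * a + qu_hat)          # O(N log N) work per residual
```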
To invert $H_D(M)$, write $H_D(M)$ in partitioned block form as

$$
H_D = \begin{bmatrix} \Omega & 0 \\ 0 & \Upsilon \end{bmatrix}
\qquad (15.78)
$$

where $\Omega$ is an $M \times M$ matrix, $\Upsilon$ is a diagonal matrix whose elements are $\Upsilon_{ij} = H_{jj}\,\delta_{ij}$, and
the 0's in (15.78) represent blocks of zeros. Then

$$
H_D^{-1} = \begin{bmatrix} \Omega^{-1} & 0 \\ 0 & \Upsilon^{-1} \end{bmatrix}
\qquad (15.79)
$$




Figure 15.6: Norm of the corrections to the coefficients for the Delves-Freeman iteration for
$M = 1, 2, \ldots, 8$, where M is the size of the upper-left block of $H_D$. The top curve (iteration
diverges) is M = 1 ($H_D$ is diagonal). The rate of convergence increases monotonically with
M. The ODE is $u_{xxxx} + [1 + 10 \cosh(4\cos(x) + 4)]\, u = \exp(-\cos(x))$ [same example as
Tables 15.4 & 15.5]. Fourier cosine basis with N = 32.

where $\Upsilon^{-1}$ is the diagonal matrix whose elements are simply the reciprocals of the diagonal
elements of H. Thus, the work in (15.79) is the cost of factoring an $M \times M$ dense matrix,
which is $O([2/3]M^3)$, plus $(N - M)$ divisions.
Thus, if $M \ll N$, the cost per iteration is dominated by the $O(N \log_2 N)$ cost of computing
the residual $r^n$ via the Fast Fourier Transform, the same as for finite difference
preconditioning.
Fig. 15.6 shows that when M = 1 (topmost graph), the iteration diverges. This is strong
proof that the Galerkin matrix is not “diagonally dominant”! For $M \ge 2$, however, the
iteration converges geometrically, just like all the other variants of Richardson's method.
The rate of convergence increases rapidly with M, especially when M is small, so it
is a good idea to use a moderate value of M rather than the smallest M which gives
convergence.
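A minimal dense-matrix sketch of the block-plus-diagonal iteration (15.77), again in Python/NumPy and not from the original text: in a production code the residual would be evaluated by the FFT as in the earlier sketch and the $M \times M$ block would be factored once, but the logic is the same. The function name and convergence tolerance are illustrative assumptions.

```python
import numpy as np

def delves_freeman(H, f, M, tol=1e-12, maxit=200):
    """Solve H a = f by the preconditioned Richardson iteration (15.77), where
    H_D = (upper-left M x M block of H) + (remaining diagonal of H), cf. (15.74).
    Converges when H is 'asymptotically diagonal' and M is large enough."""
    N = H.shape[0]
    HD = np.diag(np.diag(H))                   # diagonal part of H
    HD[:M, :M] = H[:M, :M]                     # overwrite upper-left block
    a = np.zeros(N)
    for it in range(1, maxit + 1):
        r = f - H @ a                          # residual (FFT in a real code)
        a = a + np.linalg.solve(HD, r)         # a^{n+1} = a^n + H_D^{-1} r^n
        if np.linalg.norm(r) <= tol * np.linalg.norm(f):
            break
    return a, it
```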
The concept generalizes to partial differential equations and other basis sets, too. For example,
$\cos(kx)\cos(my)$ is an eigenfunction of the Laplace operator, so the Fourier-Galerkin
representation of

$$\nabla^2 u + q(x, y)\, u = f(x, y) \qquad (15.80)$$

is “asymptotically diagonal”. The only complication is that one should use the “speedometer”
ordering of unknowns so that the columns and rows are numbered such that the smallest
i, j correspond to the smallest values of $i^2 + j^2$.
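The “speedometer” ordering itself is nothing more than a sort; a small illustrative sketch (the truncations Nx, Ny are arbitrary assumptions):

```python
# "Speedometer" ordering: number the 2-D modes (i, j) so that small i**2 + j**2
# come first, clustering the low-degree basis functions in the upper-left block.
Nx, Ny = 8, 8                                   # illustrative truncations
pairs = [(i, j) for i in range(Nx) for j in range(Ny)]
pairs.sort(key=lambda ij: ij[0] ** 2 + ij[1] ** 2)
print(pairs[:6])    # [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (2, 0)]
```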
The Delves-Freeman idea works well with certain non-Fourier basis sets. For example,
Hermite functions $\psi_n(y)$ satisfy the eigenrelation

$$\frac{d^2}{dy^2}\,\psi_n(y) = \left[\, y^2 - (2n+1) \,\right] \psi_n(y) \qquad (15.81)$$

The second derivative operator is not diagonal, but its n-dependent part contributes only
to the diagonal. This implies that the Hermite-Galerkin matrix for

$$u_{yy} + q(y)\, u = f(y) \qquad (15.82)$$

is asymptotically diagonal.
Unfortunately, there are complications for the most popular application, Chebyshev
polynomials, because the Chebyshev-Galerkin representation of derivatives is not
diagonal. With sufficient ingenuity, it is still possible to apply the same general idea or an
extension. For example, in the next section, we show that the Chebyshev solution of

$$u_{xx} + q\, u = f(x) \qquad (15.83)$$

may be converted into the solution of a bordered tridiagonal matrix (for constant q). When
q varies with x, the Galerkin matrix is dense, but “asymptotically tridiagonal”. One can
generalize HD by replacing the diagonal matrix by a tridiagonal matrix; this block-plus-
tridiagonal matrix is still cheap to invert (Appendix B).
In multiple dimensions with a Chebyshev basis, the natural extension of the Delves-
Freeman method is to iterate using the Galerkin representation of a separable problem. As
explained below in Sec. 15.11, the Galerkin representation of a separable elliptic problem
has, with a little ingenuity, a simple form that allows very cheap iteration. The resulting
iteration for nonseparable PDEs is preconditioned in spectral space. However, unlike Delves
and Freeman's simple conception, the low degree basis functions are not treated differently
from the rest of the basis.
In multiple dimensions with a Fourier or spherical harmonics basis, a simple block-
plus-diagonal iteration of Delves and Freeman type may be very effective if the unknowns
are ordered so that the low degree basis functions are clustered in the first few rows and
columns.
The Delves-Freeman algorithm can also be extended to nonlinear problems through the
Nonlinear Richardson's Iteration strategy which is described in Sec. 15.14. Boyd (1997d) is
a successful illustration with a Fourier basis in one space dimension.
The moral of the story is that one can precondition either on the pseudospectral grid
or in spectral space. The residual may be evaluated by the Fast Fourier Transform equally
cheaply in either case. The decision between Orszag's finite difference iteration and the
Delves-Freeman Galerkin iteration should be based on which of these choices leads to a
preconditioning matrix, HD , which is more easily inverted for a given problem.


15.10 Recursions & Formal Integration: Constant Coefficient ODEs
Clenshaw (1957) observed that for simple constant coefficient differential equations, Chebyshev
non-interpolating methods generate sparse matrices. It is simplest to illustrate the idea
by example. Consider

$$u_{xx} - q\, u = f(x) \qquad (15.84)$$

where q is a constant. The pseudospectral discretization of Eq. (15.84) is dense. In contrast,
Galerkin's method (using Chebyshev polynomials as “test” functions, not boundary-vanishing
basis functions) gives a nearly triangular matrix. The reason is that
$$T_{k,xx} = \sum_{m=0}^{k-2} \frac{1}{c_m}\, k\,(k^2 - m^2)\, T_m(x) \qquad (15.85)$$

where $c_m = 2$ if m = 0 and $c_m = 1$ otherwise, and where the sum is restricted to polynomials
of the same parity as $T_k(x)$. Consequently, $(T_m, T_{k,xx}) = 0$ for all $m > (k - 2)$.
Unfortunately, the cost of backsolving is still $O(N^2)$. The problem is that the derivative of a
Chebyshev polynomial involves all Chebyshev polynomials of the same parity and lower
degree.
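Eq. (15.85) is easy to verify numerically; a short check using NumPy's Chebyshev utilities (an illustrative aside, not part of the original text; the choice k = 7 is arbitrary):

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

# Check (15.85): the Chebyshev coefficients of T_k'' are k (k^2 - m^2) / c_m
# for m = k-2, k-4, ... (same parity as k), and zero otherwise.
k = 7
d2 = Chebyshev.basis(k).deriv(2).coef           # exact coefficients of T_k''
c = lambda m: 2.0 if m == 0 else 1.0
pred = [k * (k**2 - m**2) / c(m) if (k - m) % 2 == 0 else 0.0
        for m in range(len(d2))]
print(np.allclose(d2, pred))                    # True
```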
Clenshaw observed that the formula for the integral of a Chebyshev polynomial involves
just two polynomials while the double integral involves only three:

$$\int T_n(x)\, dx = \frac{1}{2}\left[\frac{T_{n+1}(x)}{n+1} - \frac{T_{n-1}(x)}{n-1}\right], \qquad n \ge 2 \qquad (15.86)$$
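Translated into coefficients, (15.86) says that if $f = \sum_n a_n T_n$, then an antiderivative $F = \sum_n b_n T_n$ has $b_n = (c_{n-1} a_{n-1} - a_{n+1})/(2n)$ for $n \ge 1$; the special cases $\int T_0 = T_1$ and $\int T_1 = (T_0 + T_2)/4$ are absorbed through $c_0 = 2$, and $b_0$ is the arbitrary constant of integration. A minimal sketch of this coefficient map, with an illustrative check (function name assumed):

```python
import numpy as np

def cheb_antiderivative(a):
    """Chebyshev coefficients b of an antiderivative of f = sum_n a[n] T_n(x):
    b[n] = (c_{n-1} a[n-1] - a[n+1]) / (2 n),  n >= 1,  with c_0 = 2, c_m = 1;
    b[0], the integration constant, is left as zero."""
    N = len(a)
    b = np.zeros(N + 1)
    for n in range(1, N + 1):
        left = (2.0 if n == 1 else 1.0) * a[n - 1]     # c_{n-1} a_{n-1}
        right = a[n + 1] if n + 1 < N else 0.0
        b[n] = (left - right) / (2.0 * n)
    return b

# check: the antiderivative of T_3 is T_4/8 - T_2/4 (+ const), as in (15.86)
print(cheb_antiderivative(np.array([0.0, 0.0, 0.0, 1.0])))  # [0. 0. -0.25 0. 0.125]
```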
Formula (15.86) is actually equivalent to (15.39), the recurrence for computing the coefficients
of df/dx from those of f(x). Clenshaw himself exploited (15.39) to manipulate the Galerkin
equations into tridiagonal form, but there are two other ways of obtaining the same result.
One is to apply the Mean-Weighted Residual method using the second derivatives of
the Chebyshev polynomials as the “test” functions. These derivatives are the Gegenbauer
polynomials of second order (as explained in Appendix A), so they are orthogonal, implying
that the contributions of the second derivative to the Galerkin matrix appear only
on the diagonal, and they are complete, implying that these Gegenbauer polynomials are
legitimate “test” functions. Via recurrence relations between Gegenbauer polynomials of
different orders, $T_m(x)$ may be written as a linear combination of the three Gegenbauer
polynomials of degrees $(m - 2, m, m + 2)$, so that the Galerkin matrix has only three nonzero
elements in each row.
The third justification, completely equivalent to the first two, is to formally integrate
the equation twice to obtain

$$u - q \iint u = \iint f + A + B x \qquad (15.87)$$

where A and B are arbitrary constants determined by the boundary conditions. If we then
demand that (15.87) should be orthogonal to $T_2(x), \ldots, T_N(x)$, we obtain
$$
a_n - q \left[ \frac{c_{n-2}}{4n(n-1)}\, a_{n-2} - \frac{\sigma_n}{2(n^2-1)}\, a_n + \frac{\sigma_{n+2}}{4n(n+1)}\, a_{n+2} \right]
= \frac{c_{n-2}}{4n(n-1)}\, f_{n-2} - \frac{\sigma_n}{2(n^2-1)}\, f_n + \frac{\sigma_{n+2}}{4n(n+1)}\, f_{n+2}
\qquad (15.88)
$$
where $\sigma_n = 1$ for $n < N - 1$ and $\sigma_n = 0$ for all larger n, and where the $f_n$ are the Chebyshev
coefficients of the forcing function, f(x).
The resulting matrix contains two full rows to impose the boundary conditions, so it
is only quasi-tridiagonal. However, the methods for “bordered” matrices (Appendix B)
compute the coefficients $a_n$ in “roughly the number of operations required to solve
pentadiagonal systems of equations”, to quote Gottlieb & Orszag (1977, pg. 120). Canuto et al.
(1988) give a good discussion of generalizations to other boundary conditions and so on.
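To make the structure concrete, here is a hedged sketch (Python/NumPy, not from the original text) that assembles the rows (15.88) for n = 2, ..., N together with two boundary rows enforcing u(-1) = u(1) = 0, and then solves the system. A dense solve is used purely for clarity; a real code would exploit the banded-plus-bordered structure (Appendix B). The function name, the Dirichlet boundary conditions, and the test problem are illustrative assumptions; truncation flags are handled simply by dropping terms whose index exceeds N.

```python
import numpy as np

def cheb_integrated_galerkin(q, fcoef, N):
    """Solve u_xx - q u = f with u(-1) = u(1) = 0 via the integrated Galerkin
    equations (15.88), bordered by two boundary rows.  fcoef holds the
    Chebyshev coefficients f_0 ... f_N of the forcing."""
    A = np.zeros((N + 1, N + 1))
    rhs = np.zeros(N + 1)
    c = lambda m: 2.0 if m == 0 else 1.0
    for n in range(2, N + 1):
        w_m2 = c(n - 2) / (4.0 * n * (n - 1))      # weight of index n-2
        w_0 = 1.0 / (2.0 * (n * n - 1))            # weight of index n
        w_p2 = 1.0 / (4.0 * n * (n + 1))           # weight of index n+2
        A[n, n - 2] = -q * w_m2
        A[n, n] = 1.0 + q * w_0
        rhs[n] = w_m2 * fcoef[n - 2] - w_0 * fcoef[n]
        if n + 2 <= N:                             # sigma_{n+2} truncation flag
            A[n, n + 2] = -q * w_p2
            rhs[n] += w_p2 * fcoef[n + 2]
    A[0, :] = 1.0                                  # u(+1) = sum a_n         = 0
    A[1, :] = [(-1.0) ** n for n in range(N + 1)]  # u(-1) = sum (-1)^n a_n  = 0
    return np.linalg.solve(A, rhs)

# check against the exact solution u = 1 - x^2 with q = 1:  f = u_xx - u = -3 + x^2
N = 16
fcoef = np.zeros(N + 1)
fcoef[0], fcoef[2] = -2.5, 0.5                     # -3 + x^2 = -2.5 T_0 + 0.5 T_2
a = cheb_integrated_galerkin(1.0, fcoef, N)
print(np.round(a[:4], 12))                         # [ 0.5  0. -0.5  0.]  (= 1 - x^2)
```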
Clenshaw and others applied his recursive techniques to many different equations. If
the coefficients are polynomials, the result will again be a banded matrix. If q includes a
term in $x^2$, the matrix is quasi-pentadiagonal. If q has a term in $x^4$, the matrix is
quasi-heptadiagonal, and so on.
The most complete archive of early work is Chapter 5 of Fox & Parker (1968). The
authors point out that Clenshaw's strategy not only creates sparsity, but also may improve
accuracy. The reason is that the Chebyshev series of a derivative always converges more
slowly than that of u(x) itself. After integration, accuracy is no longer limited by that of
the slowly convergent series for the second derivative, but only by that of u(x) itself.
Of course, as stressed many times above, factors of $N^2$ are irrelevant for exponentially
convergent approximations when N is large. The extra accuracy and sparse matrices
produced by the double integration are most valuable for paper-and-pencil calculations, or
when N is small. In consequence, these integrate-the-ODE methods have gone rather out
of fashion.
Zebib (1984) is a return to this idea: he assumes a Chebyshev series for the highest
derivative (rather than u(x) itself) and obtains formulas for the contributions of lower
derivatives by applying (15.85). Although this procedure is complicated, especially for
fourth order equations, it both improves accuracy and eliminates very large, unphysical
complex eigenvalues from the Chebyshev discretization of the Orr-Sommerfeld equation
(Zebib, 1987b).
Coutsias, Hagstrom and Torres (1996) have extended Clenshaw's method to all the standard
orthogonal polynomials including Laguerre and Hermite polynomials. They also provide
useful identities, recursions, and estimates of condition number. Clenshaw's method
gives banded matrices when the coefficients of the ODE are rational functions as well as
when the coefficients are polynomials. Coutsias et al. show that by changing the coordinate
using a rational function, it is possible to resolve narrow boundary layers without
sacrificing a banded matrix representation, as illustrated with numerical examples.
Thus, the integrate-the-ODE methods have a small but useful role in spectral methods
even today.


15.11 Direct Methods for Separable PDE's
The earliest work on spectral methods for separable PDE's is Haidvogel and Zang (1979).
They offer two algorithms for exploiting the separability, one iterative and one direct. We
shall discuss only the latter.
The key to the Haidvogel-Zang algorithm is a procedure which can also be applied to
one-dimensional problems, so we will begin with the ODE case. When

$$[\, p(x)\, u_x\, ]_x - q\, u = f(x) \qquad (15.89)$$

is discretized, it generates the matrix problem


$$\left( \Lambda_x - q\, I \right) a = f \qquad (15.90)$$


where I is the identity matrix. We could absorb the constant q into Λx , of course, but have
split it off for future convenience.
If the collocation matrix Λx is diagonalizable, we can write

$$\Lambda_x = E_x\, \tilde{\Lambda}_x\, E_x^{-1} \qquad (15.91)$$

where $E_x$ is the matrix whose columns are the eigenvectors of $\Lambda_x$ and where $\tilde{\Lambda}_x$ is the
diagonal matrix whose elements are the corresponding eigenvalues. Eq. (15.90) becomes
$$E_x\, ( \tilde{\Lambda}_x - q\, I\, )\, E_x^{-1}\, a = f \qquad (15.92)$$
Multiplying through by $E_x^{-1}$ gives
$$( \tilde{\Lambda}_x - q\, I\, )\, E_x^{-1}\, a = E_x^{-1}\, f \qquad (15.93)$$

Define

$$b \equiv E_x^{-1}\, a\,; \qquad g \equiv E_x^{-1}\, f \qquad (15.94)$$
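In one dimension the algebra above amounts to three cheap steps once the eigendecomposition is in hand: transform f, divide by the shifted eigenvalues, and transform back. A minimal sketch (Python/NumPy, not from the original text, assuming $\Lambda_x$ is diagonalizable and no eigenvalue equals q; the function name is illustrative):

```python
import numpy as np

def solve_by_diagonalization(Lx, q, f):
    """Solve (Lx - q I) a = f, Eq. (15.90), using Lx = E diag(lam) E^{-1}."""
    lam, E = np.linalg.eig(Lx)        # eigenvalues and eigenvector matrix (15.91)
    g = np.linalg.solve(E, f)         # g = E^{-1} f                       (15.94)
    b = g / (lam - q)                 # (diag(lam) - q I) b = g            (15.93)
    a = E @ b                         # back-transform:  a = E b
    return np.real_if_close(a)        # discard round-off imaginary parts
```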
