. 66
( 136 .)


Eq. (15.93) becomes the trivial problem

(»i ’ q) bi = gi (15.95)
i = 0, . . . , N + 1

After solving (15.95), we obtain a from b by multiplying the latter by E.
Once the eigenvectors have been computed (in O[N 3 ] operations), each solution for
new values of q and/or f costs only O(6N 2 ) operations, that is, three multiplications of a
column vector by a square matrix. In the semi-implicit time-stepping algorithm of Orszag
and Patera (1982), this tactic is applied in the radial direction. It minimizes storage over
the more obvious technique of Gaussian elimination because the radial boundary value
problems are identical for different polar and axial wavenumbers except for a different
value of the constant q. Thus, they only need to store two matrices, Ex and its inverse,
whereas with Gaussian elimination it would be necessary to store a matrix of the same size
for each pair of polar and axial wavenumbers (several hundred in all!).
This trick is also a direct method of attacking two-dimensional separable problems be-
cause the diagonalization still works even if q is a differential operator, not a constant, so long
as the operator involves only the other coordinate, y. If this condition is met, then Ex and
»x will be independent of y. For example, if

q ≡ (s(y) ‚y ) ‚y + t(y), (15.96)

then if we discretize only the x-derivatives, (15.95) becomes a set of (N + 1) uncoupled
ordinary differential equations in y:

[s(y) bi,y ]y + [t(y) ’ »i ] bi (y) = ’gi (y) (15.97)

This is precisely what is generated by the analytical method of separation of variables,
applied to the original PDE. The condition that the discretized x-derivative matrix can
be diagonalized independent of y, which demands that the operator q involves only y
derivatives and functions of y, is just the condition that the original problem is separable.
In fact, the Haidvogel-Zang direct method is separation-of-variables.
An obvious question is: Why do we need this post-collocation variant of separation
of variables? The answer is that unless the boundary conditions are spatial periodicity, the
separation-of-variables eigenfunction series will have a slow, algebraic rate of convergence.
To obtain high accuracy, we need Chebyshev polynomials in x.4
4 Infairness, sometimes Chebyshev series converge as slowly as eigenseries for time-dependent problems when
the initial conditions fail to satisfy suf¬cient “compatibility conditions” at the boundaries (Boyd and Flyer, 1998).

One way of looking at the Haidvogel-Zang procedure is that it is a method of solving
the set of (N + 1) coupled ordinary differential equations for the ai by making a “smart”
change-of-basis [in the space of (N + 1)-dimensional column vectors] to a new set of coef-
¬cients, the bi , such that the differential equations uncouple.
Solving N ordinary differential equations is still moderately expensive, but for Pois-
son™s equation (or any which is constant coef¬cient in y), Haidvogel and Zang note that
Clenshaw™s method (previous section!) applies at a cost of N operations per ODE in y.
Thus, after an O(N 3 ) preprocessing step, we can solve a separable PDE for different f (x, y)
at a cost of O(N 2 ) per solution. The ef¬ciency of the Haidvogel and Zang procedure can be
extended to equations with y-dependent coef¬cients by using the preconditioned Richard-
son™s iteration as described in earlier sections. (Note that variable x-coef¬cients are allowed
in either case.)
The preprocessing step, which is a one-dimensional eigenproblem, costs O(N 3 ) opera-
tions whereas fast direct ¬nite difference methods require only O(N 2 log2 N ) operations
where the tensor product grid has a total of N 2 points. However, on a 1997 personal
computer, the cost of solving an N = 100 one-dimensional eigenproblem was only half
a second! The preprocessing step was expensive in 1979, but not today.
If the separable elliptic problem is part of a semi-implicit time-marching algorithm,
then the cost of the O(N 3 ) preprocessing step for the spectral method is amortized over
thousands of time steps. The actual cost per time step is merely that for a single backsolve,
which is O(N 2 ) ” directly proportional to the number of grid points ” for both spectral
and ¬nite difference methods.
Haldewang, Labrosse, Abboudi and Deville (1984) compared three algorithms for the
Helmholtz equation. The Haidvogel-Zang strategy is to diagonalize in two dimensions
and solve tridiagonal systems in the other, and proved the fastest. However, Haldewang
et al. found moderately severe roundoff problems with this method, losing ¬ve decimal
digits for N as small as 64.
Full diagonalization (in all three dimensions) was slightly slower but easier to program.
This algorithm became very popular in France and perhaps deserves a wider range of
The third algorithm was a ¬nite difference-preconditioned iteration. This is cheap for
separable problems because the preconditioning can be done using the special “fast direct
methods” for separable problems, which cost only O(N 3 log2 (N )) operations on an N 3 grid
in three dimensions.6 The preconditioned iteration was an order of magnitude slower than
the Haidvogel-Zang procedure at all N , but the fast direct ¬nite difference solver can be
used as a preconditioner for nonseparable problems. The iteration loses little accuracy to
round-off error even at very large N .
Liffman(1996) extended the Haidvogel-Zang eigenanalysis to incorporate very general
(Robin) boundary conditions. This has been widely used by the Nice group under the
name of “matrix diagonalization” since Ehrenstein and Peyret (1989).
Patera (1986) combined the Haidvogel-Zang algorithm with static condensation to cre-
ate a spectral element Poisson-solver with an O(N 5/2 ) operation count, but this algorithm
has largely been replaced by multigrid.
Siyyam and Syam(1997) offer a Chebyshev-tau alternative to the Haidvogel-Zang tech-
nique. Their experiments show that their method is faster, at least for large N , but their
algorithm has not been extended to PDEs with spatially-varying coef¬cients.
5 Because there are often several algorithms of roughly equal effectiveness for a given problem, numerical
analysis is prone to national and individual fashions that are determined more by accidents of education and
exposure than by rigorous mathematical justi¬cation.
6 The FISHPACK library, written by John Adams and available through the National Center for Atmospheric

Research, is an excellent collection of fast direct methods.

Ierley(1997) has developed a similar fast algorithm for the Laplace operator in an arbi-
trary number of space dimensions. One key ingredient is to use Jacobi polynomials rather
than Chebyshev polynomials.
J. Shen (1994a,1995b) and Lopez and Shen(1998) developed Legendre-Galerkin schemes
which cost O(N 3 ) operations for second and fourth order constant coef¬cient elliptic equa-
tions. With the basis φj (x) = Lj+2 (x) ’ Lj (x), the weak form of the second derivative
is a diagonal matrix. He developed similar Chebyshev algorithms in Shen(1995a). Both
his Chebyshev and Legendre forms have condition numbers of O(N 4 ) for fourth order
problems, versus O(N 8 ) for some other spectral algorithms, so that it is possible to solve
the biharmonic equation with O(500) unknowns in each coordinate to high accuracy. His
Chebyshev algorithm costs O(N 4 ) for fourth order problems, but Bjørstad and Tjøstheim
(1997) developed an O(N 3 ) modi¬cation, and made a number of further improvements in
both Legendre and Chebyshev algorithms for fourth order problems.
Braverman, Israeli, Averbuch and Vozovoi (1998) developed a Poisson solver in three
dimensions which employs a Fourier basis even though the boundary conditions are non-
periodic. A clever regularization method restores an exponential rate of convergence,
which would normally be lost by non-periodic application of a Fourier series.

15.12 Fast Iterations for Almost Separable PDEs

D™yakonov(1961) and Concus and Golub (1973) noted that non-separable elliptic equations
can be solved very ef¬ciently by using an iteration in which a separable differential equa-
tion is solved at each iteration. (Obviously, the coef¬cients of the separable BVP are chosen
to approximate those of the nonseparable problem as closely as possible.)
Several groups have developed ef¬cient spectral algorithms for nonseparable elliptic
equations using a D™yakonov/Concus-Golub iteration. Guillard and Desideri(1990) com-
bined the Minimum Residual Richardson™s iteration with two different spectral precondi-
tioners that both solved separable problems. The more ef¬cient preconditioner required
computing eigenvalues and eigenvectors in a preprocessing step as in the Haidvogel and
Zang procedure. Strain(1994) solved non-separable periodic problems with a Fourier ba-
sis where the solution of the separable differential equation is almost trivial. Zhao and
Yedlin(1994) showed that the ef¬ciency of solve-a-separable-PDE iterations is as good as
multigrid, at least sometimes. Dimitropoulus and Beris (1997) found that the combination
of a Concus-Golub iteration with a fast spectral Helmholtz solver did not always converge
for problems with strongly varying coef¬cients. However, when this was embedded as an
inner iteration with the biconjugate gradient method, BiCGstab(m), as the outer iteration,
the result was a very robust and ef¬cient algorithm. Typically only two Concus-Golub it-
erations were needed between each outer iteration; the biconjugate gradient method never
needed more than nine iterations.
The title of this section, “almost separable”, is a little misleading. Although these iter-
ations converge most rapidly if the PDE is only a slight perturbation of a separable prob-
lem, the examples of Zhao and Yedlin, Strain, and Dimitropoulus and Beris show that
often the D™yakonov/Concus-Golub algorithm converges very rapidly even when the BVP
only vaguely resembles a separable problem. This is especially true when the iteration is
combined with other convergence-accelerating strategies such as the biconjugate gradient
method used as an outer iteration by Dimitropoulus and Beris.

15.13 Positive De¬nite and Inde¬nite Matrices
As noted in Sec. 2, it is very important that the pseudospectral matrix be positive de¬nite
because most iterations are analogous to time-integration of the diffusion equation

ut = ’Λ u + f (15.98)

where Λ is the square matrix which is the discretization of the differential operator. If
Λ has only positive eigenvalues, then the solution of (15.98) will decay monotonically to
the desired steady-state which solves Λu = f . However, if Λ is inde¬nite, that is, has
eigenvalues of both signs, then the negative eigenvalue modes will grow exponentially
with time and the iteration will diverge.
This would seem to be a very serious restriction because even so simple a problem as
the one-dimensional Helmholtz equation

’uxx ’ q u = f (x) (15.99)
u(0) = u(π) = 0

with periodic boundary conditions has the eigenvalues

» n = n2 ’ q (15.100)
n = 1, 2, . . .

The lowest eigenvalue is negative for q > 1, two eigenvalues are negative for q > 4, three
for q > 9, and so on. Richardson™s iteration would diverge.
Happily, both preconditioning and multigrid usually solve the problem of a ¬nite num-
ber of small eigenvalues of the wrong sign. The reason that preconditioning is effective
is that the offending modes are the ones with the simplest spatial structure. If q = 8,
for example, only the modes sin(x) and sin(2x) are unstable. However, these modes are
precisely the ones which will be accurately resolved by both the ¬nite difference and pseu-
dospectral approximations. Consequently, sin(x) and sin(2x) (or more precisely, the col-
umn vectors containing the grid point values of these functions) are both eigenvectors of
the pre-conditioned matrix H Λ with eigenvalues approximately equal to 1.
There is a snare: when q is very close to n2 , one eigenvalue of Λ may be very, very close
to zero ” smaller in magnitude than the small error in the ¬nite difference approximation
of this same mode. Canuto et al. (1988, pg. 143) show that the pre-conditioned eigenvalue
n2 ’ q
»(p) (15.101)
x/2)2 ’ q
n2 sin2 (n x/2)/(n

For ¬xed q & n, »n (p) ’ 1 in the limit x ’ 0. However, for ¬xed x, there is always
a range of q suf¬ciently close to n2 so that »n is negative and Richardson™s iteration will
then diverge for any time step.
Since the grid spacing x always is small, one has to be rather unlucky for a pre-
conditioned iteration to diverge, but clearly, it can happen. It is encouraging, however,
that preconditioning will remove the problem of matrix inde¬niteness most of the time.
Multigrid is intrinsically just an ef¬cient way of performing Richardson™s iteration. It
will therefore diverge due to matrix inde¬niteness whenever Richardson™s would, too.
However, a common practice in multigrid ” even when the matrix is positive de¬nite ” is
to apply Gaussian elimination instead of iteration on the coarsest grid. The obvious reason
is that when the coarse grid contains only nine points, Gaussian elimination is ridiculously

cheap. The subtle, empirical reason is that using direct methods on the coarsest grid seems
to give additional stability and improved convergence rate to the whole iteration.
When the matrix is inde¬nite but has only one or two negative eigenvalues, this practice
of Gaussian elimination on the coarsest grid is a life-saver. Although the coarse grid cannot
resolve the high modes of the operator, it can resolve ” at least crudely ” the lowest one
or two modes which (otherwise) would make the iteration diverge.
A little care is needed. If there are several unstable modes, one needs a coarsest grid
which has at least a moderate number of points.
There is, alas, a small class of problems for which neither multigrid nor precondition-
ing will help: those with an in¬nite number of eigenvalues of negative sign. An example is
Laplace™s Tidal equation for waves in a semi-in¬nite atmosphere (Chapman and Lindzen,
1970). The tidal equation is very messy because it includes complicated trigonometric
terms due to the earth™s sphericity. However,

uyy + (ω 2 ’ y 2 ) uzz = f (y, z) (15.102)

u(y, z) bounded as z ’ ∞ (15.103)
u(±L, z) = 0 & u(y, 0) = 0 &

displays the same qualitative behavior.
Separation-of-variables gives the latitudinal eigenequation

uyy + » (ω 2 ’ y 2 ) u = 0, (15.104)

the parabolic cylinder equation. If the boundary L > ω, then (15.104) has an in¬nite num-
ber of solutions with positive eigenvalues. The eigenmodes oscillate on y ∈ [’ω, ω] and
decay exponentially for larger y; they may be accurately approximated by Hermite func-
tions (as is also true of the corresponding tidal modes). The vertical structure of these
modes is oscillatory.
There is also an in¬nite number of modes with negative eigenvalues. These oscillate on
the intervals y ∈ [’L, ’ω] & y ∈ [ω, L] and are exponentially small at y = 0. These modes
decay exponentially as z ’ ∞.
Eq. (15.102) is rather peculiar: an equation of mixed elliptic-hyperbolic type. The bound-
ary conditions are also a little peculiar, too, in that either oscillatory or exponential behavior
is allowed as z ’ ∞. Nevertheless, the linear wave modes of a rotating, semi-in¬nite at-
mosphere are described by a mixed elliptic-hyperbolic equation similar to (15.102), and the
modes really do fall into two in¬nite classes: one with » > 0 and the other with » < 0.
No iterative method has ever been successfully applied to these problems. Lindzen (1970),
Schoeberl & Geller (1977) and Forbes & Garrett (1976) have all obtained good results by
combining ¬nite differences with banded Gaussian elimination.
Navarra(1987) combined ¬nite differences in z with spherical harmonics in latitude. By
applying Arnoldi™s matrix-solving iteration, Navarra solved mixed-type boundary value
problems with as many as 13,000 unknowns.
In principle, inde¬niteness may be cured by squaring the matrix, i. e.
Λ Λu = Λ f
where superscript “T” denotes the matrix transpose. The disadvantage of matrix-squaring
is that if the condition number of Λ is O(N 2 ), then the condition number of the positive
de¬nite matrix Λ Λ is O(N 4 ).
However, we have already seen that preconditioning is strongly recommended even
for the unsquared matrix. For Λ and Λ Λ, ¬nite difference preconditioning will give a

condition number for the squared matrix which is O(1), independent of N . This tactic of
squaring the matrix should work even for the dif¬cult case of the Laplace Tidal equation,
provided that none of its eigenvalues are too close to zero7 .
What if Λ has complex eigenvalues? First, the transpose in (15.105) should be interpreted
as the Hermitian adjoint. Second, the Chebyshev acceleration of Richardson™s iteration (¬rst
edition of this book) should be avoided unless the imaginary parts are small. The Richard-
son™s iteration, with or without preconditioning, will still converge rapidly as long as the
real parts of the eigenvalues are all positive, and better still, so will its Minimum Residual


. 66
( 136 .)