(10.28)

= φN,x (xi )

xi ’ xj

j=1,i=j

since the second sum is zero except for contributing a singular term which cancels the

i = j term in the ¬rst sum. (To avoid unpleasant cancellation of singularities, one could

have noted that since the i-th cardinal function has a maximum at x = xi , its derivative

10.6. GENERALIZED FFTS AND MULTIPOLE METHODS 197

x

xM

xm1 xm2

µ0/(x-x M)+ ...

Figure 10.4: The grid point contributions on a typical interval, x ∈ [xm1 , xm2 ], can be re-

placed by one or a few terms of a multipole expansion when the target point x is suf¬ciently

far away, such as any of the grid points (short vertical lines) pointed to by the black arrows.

is zero at that point and one can therefore omit the i-th term in the sum in (10.27) before

performing the differentiation.) Thus, differentiation is a sum of multipole form, too. The

common factor of φN,x (xi ) is computed and stored during the initialization of the program.

The multipole idea, de¬ning the “charges” qj ≡ fj /φN,x (xj ), is to replace a portion of

the sum where j ∈ [m1 , m2 ] by a geometric series in inverse powers of (x ’ xM ) where

M = (m1 + m2 )/2. It is implicitly assumed that this particular series will be used only for

x = xi which are far from the interval [xm1 , xm2 ] (Fig. 10.4). For different ranges of xi , one

must use different multipole series, which makes the bookeeping a bit tricky. Nevertheless,

the rewards are great when N is very large. Without approximation,

’1

m2 m2

xM ’ xj

qj 1

= qj 1+

(x ’ xj ) x ’ xM x ’ xM

j=m1 j=m1

∞ k+1

1

(10.29)

= µk

x ’ xM

k=0

where the “multipole moments” are

m2

k

qj (xM ’ xj ) (10.30)

µk =

j=m1

In practice, the in¬nite series of multipoles (which is exact) must be truncated to just a

handful of multipoles, which requires some careful thought about error tolerances. One

drawback of FMM is that the cost increases roughly linearly with the number of correct

digits.

A number of variations on this “divide-and-conquer” strategy have been devised. Alpert

and Rohklin (1991) have developed a fast (i. e., O(N log2 (N )) algorithm for converting a

Chebyshev series to a Legendre series or vice versa. The heart of the algorithm is that the

elements of the transformation matrix vary smoothly with row and column indices, which

allows one to replace blocks of the usual matrix-vector multiply by expanding the matrix

elements as low order polynomials in the indices and then reversing the order of sum-

mation. Although their scheme does not directly compute derivatives, one can evaluate

CHAPTER 10. PARTIAL SUMMATION, THE FFT AND MMT

198

derivatives using the regular FFT on the Chebyshev series, so their algorithm allows fast

solutions to differential equations using a Legendre basis. Unfortunately, their algorithm

does not generalize to Associated Legendre functions because, except for the special case

of Legendre polynomials, the Associated Legendre-to-Chebyshev matrix is not suf¬ciently

smooth.

However, a variety of other schemes have been invented to transform spherical har-

monics including Dilts (1985), Brown (1985), and Jakob (1993) and Jakob-Chien & Alpert(1997)

(Table 10.4). Orszag (1986) devised an ingenious algorithm which can be applied to almost

any series of functions which satisfy three-term recurrence relations. This includes all or-

thogonal polynomials plus some types of Bessel function series. Driscoll, Healy and Rock-

more (1997) have developed a faster transform that for any set of orthogonal polynomials.

Unfortunately, the FMM/Fast Generalized Transform revolution has been long on in-

genuity and short on applications to spectral methods. The reason is that although these

algorithms have costs which are O(N log2 (N ), the proportionality constant, which is (5/2)

for a Chebyshev FFT, is very large for these algorithms. When N is 10,000, as in stellar

simulations, these algorithms generate vast savings. For the N = 213 resolution of 1996

spherical harmonics forecasting models, the savings relative to the Matrix Multiplication

Transform are so small ” and perhaps nonexistent ” that the FMM and its brethren have

not been used. As N rises in hydrodynamics models, however, such algorithms will be-

come increasingly attractive as alternatives to the MMT for basis sets like spherical har-

monics for which no FFT is known.

10.7 Transforms on an Irregular, Nonequispaced Grid: “Off-

grid” Interpolation

ˆe

In Semi-Lagrangian time-marching schemes (Staniforth and Cot´ , 1991, and Ritchie, 1987,

1988, 1991), one must interpolate velocities and other ¬elds to the departure points of tra-

jectories, which are spaced irregularly with the respect to the ¬xed, evenly spaced Fourier

grid (or the uneven Chebyshev grid). Because the spectral series must be summed at N

unevely spaced, irregularly distributed points, the Fast Fourier Transform cannot be used

even with a Fourier or Chebyshev basis. Worse still, this “off-grid” interpolation must be

performed every time step.

The naive strategy is to simply sum the spectral series term-by-term, but this costs

O(N 2 ) operations versus the O(N log2 (N )) cost of an FFT. However, there is an added

cost: because the grid points shift from time step to time step, the sines and cosines or

Chebyshev polynomials cannot be precomputed during program initialization, but must

be computed afresh at each step. This means that the cost of the MMT is O(N 2 ) with a large

proportionality constant instead of the usual factor of two. Since all other steps in a Cheby-

shev or Fourier semi-Lagrangian code can normally be performed by FFT and other fast

procedures, the “off-grid” interpolation is the rate-determining step, the most expensive

step, in the entire algorithm.

Fortunately, ¬ve strategies for fast off-grid interpolation are known. The ¬rst is to apply

the FMM; ¬rst published by Boyd (1992c), the FMM/off-grid procedures have been sys-

tematically developed by Dutt and Rohklin (1993, 1994). The second strategy is a Cheby-

¨

shev polynomial expansion procedure invented by Suli and Ware (1991, 1992). Anderson

and Dahleh (1996) devised a similar algorithm with Taylor series instead of Chebyshev

series; they are unique by also providing an ef¬cient algorithm for the reverse off-grid

transform (from grid point values at a set of irregularly spaced points to Fourier coef¬-

cients.) Moore, Healy and Rockmore(1993) invented a fast O(N log2 (N )) algorithm for

10.7. OFF-GRID INTERPOLATION 199

discrete Fourier transforms on nonuniform grids. Ware (1998) is a good review with some

original comparisons between different algorithms.

FFT

Figure 10.5: Boyd™s off-grid interpolation scheme, illustrated through a Fourier basis. The

¬rst step is a standard FFT (arrow) from an evenly spaced coarse grid with N points to

a ¬ner grid (bottom). The second step is low order polynomial interpolation to obtain an

approximation at the off-grid point (marked by the large “X”), using a small number of

grid points centered on the target (heavy circles).

The fourth strategy is a two-part procedure: (i) interpolate to a ¬ne grid, typically with

3N points, from the N -point canonical grid and (ii) apply low order polynomial interpola-

tion or Euler summation of the cardinal series on the ¬ne grid. It is clear that polynomial

interpolation, using the 2M + 1 points nearest the target point, is relatively inexpensive

with a cost of O(M 2 N ) operations at most. (We say “at most” because for large M , the

usual Lagrangian formulas can be replaced by more sophisticated algorithms with some

saving.) However, if polynomial interpolation is performed on the original grid, much ac-

curacy is lost: the precision of a hundred-point approximation to derivatives is replaced by

perhaps a seven-point approximation to velocities at the departure points.

On the ¬ne grid, however, the shortest waves in the function being interpolated have a

wavelength equal to 6hf ine , assuming hf ine = h/3. Boyd shows that each increase in M

by one will reduce the error in interpolating the 6hf ine waves by a factor of four; for other

wavelengths, the ratio is 1/ sin2 (π h/Wavelength), which is even larger. By choosing the

proper combination of M and the ratio of hf ine /h, one can recover spectral accuracy at a

cost which is proportional to O(N log2 (N )). The cost of the interpolation to the ¬ne grid of

3N points is three times the cost of an FFT on N points, but this still gives a relatively small

proportionality constant. The programming is also much easier than the Fast Multipole

Method; Lagrangian interpolation and the FFT are the key building blocks, and both are

FORTRAN library software or built-in commands in MATLAB.

CHAPTER 10. PARTIAL SUMMATION, THE FFT AND MMT

200

10.8 Fast Fourier Transform: Practical Matters

The ¬rst point is: It is never necessary to write an FFT routine yourself. Not only are FFTs

available in most software libraries, such as the NAG and IMSL libraries, and as a built-in

command in MATLAB, but also many books give FORTRAN or C codes (Canuto et al.,

1988, Appendix B, Fornberg, 1996, Appendix F, Press et al., 1986). All three books give

special FFTs for (i) real data (ii) cosine transforms and (iii) sine transforms.

The second point is that unfortunately there is no universally accepted convention for

the FFT. Some software de¬nes it as the complex conjugate of what other routines compute;

others begin the summation with zero instead of one, and thus multiply the transform by

exp(2πi/N ), etc. It is very important to carefully read the documentation (or comment

statements) in your chosen library software to understand what it is computing.

For example, most FFTs sum over wavenumbers from k = 1 to N (as in our Eq. (10.19)

above) instead of k = ’N/2 to k = N/2 ’ 1, which is the usual convention in solving

differential equations. The sum over positive wavenumbers is an “aliased” form in the

sense that k = (3/4)N and k = ’N/4 have identical grid point values on a grid of only N

points, and similarly for all wavenumbers k > N/2. To correctly compute derivatives, one

must multiply the coef¬cient of a wavenumber k > N/2 by its “alias” whose wavenumber

is on k ∈ [’N/2, N/2].

There are many formulas for extracting an optimal cosine transform from a general,

complex FFT if a cosine transform is not available. Details are given in the books by Canuto

et al., Fornberg, and Briggs and Henson (1995).

10.9 Summary

• For large N and a basis of Fourier functions or Chebyshev polynomials or rational

Chebyshev functions, the most ef¬cient way to pass from grid points to spectral co-

ef¬cients or vice versa is through the Fast Fourier Transform (FFT)

• If the solution has symmetries, such as periodic data which can be represented by a

cosine series instead of a full Fourier series, these symmetries can be exploited by the

FFT.

• The cost of the FFT is approximately ((5/2) log2 (N ) for all transforms of real-valued

data including cosine, sine, odd cosine, odd sine and Chebyshev transforms if N is

interpreted as the appropriate fraction of points on the periodicity interval which are

actually transformed

• The FFT, usually in several forms including separate, optimized subroutines for the

sine and cosine transform, is available in most software libraries and is also given

in Numerical Recipes (Press et al., 1986, 1992) and in Appendix B of Canuto et al. and

Appendix F of Fornberg (1996).

• Both the grid-point-to-spectral coef¬cient transform (interpolation) and the spectral-

to-grid transform (summation of spectral series) can be performed by a square ma-

trix/vector multiply at a cost of 2N 2 operations for real-valued data; this is the Matrix

Multiplication Transform (MMT).

• For small to moderate N , the Matrix Multiplication Transformation is more ef¬cient

than the FFT for Chebyshev and Fourier calculations.

• The “cross-over” point where the FFT becomes faster than MMT is highly dependent

on both hardware and software and may range from N = 8 to N = 512.

10.9. SUMMARY 201

• When many one-dimensional transforms are computed simultaneously, the cost of

the FFT can be greatly reduced by banishing the indexing to the outermost loops.

• The FFT is restricted to the Fourier basis, Chebyshev polynomials, and basis functions

obtained from these by a change of coordinates.

• The MMT can be applied to any orthogonal polynomial or Fourier basis including

Hermite functions, Laguerre functions and others for which the FFT is inapplicable.

• For basis sets whose elements have de¬nite parity, the MMT can be split into two

transforms of size N/2 to obtain the Parity Matrix Multiplication Transformation

(PMMT), which costs only half as much as the MMT.

• Generalized FFTs, many based on the Fast Multipole Method (FMM), have been de-

veloped for most non-FFT basis sets.

• The generalized FFTs, although faster than MMT or PMMT in the asymptotic limit

N ’ ∞, have such large proportionality constants p in cost estimates of the form

O(pN log2 (N )) that these algorithms have only rarely been used in pseudospectral

calculations.

• Fast algorithms for interpolating to a set of N points which are spaced irregularly

with respect to the standard pseudospectral grid have been developed; these have

a cost of O(N log2 (N )) like the FFT, but with a proportionality constant three or

four times larger than for the FFT. (Needed for semi-Lagrangian time-marching al-

gorithms, among others.)

• De¬nitions of the FFT vary; read your documentation carefully and adjust the output

of the library software to the transform you want to compute.

Chapter 11

Aliasing, Spectral Blocking, &

Blow-Up

“Blowup of an aliased, non-energy-conserving model is God™s way of protecting you from

believing a bad simulation.”

” J. P. Boyd

”If a problem ... for which the solution develops a singularity in ¬nite time ... is simu-

lated with a ¬xed resolution, then a dealiasing procedure is advisable. Moreover, if a fully

turbulent ¬‚ow is simulated with marginal resolution, then dealiasing may also be useful.”

“ Canuto et al.(1988, pg. 123)

11.1 Introduction

On a grid of N points, Fourier components exp(ikx) with |k| > N/2 appear as low wavenum-

bers; the high wavenumber is said to be “aliased” to the low. Phillips (1959) attributed

the blowup of his model of the earth™s general circulation (1956) to an instability speci¬-

cally caused by aliasing. Because of his work, many modern codes employ a “dealiasing”

procedure or special numerical algorithms that conserve the discretized energy so that a

runaway to supersonic winds is impossible.

There are precursors to blow-up. One is the appearance of “2h” waves in physical

space, that is, waves whose wavelength is twice the grid spacing h. This is accompanied

by a tendency for the Fourier coef¬cients to curl up near k = π/h, the “aliasing limit”; this

growth with k, instead of the expected decay, is called “spectral blocking”.

In this chapter, we review all these matters. Although “aliasing instability” and dealias-

ing algorithms are well-known, there is still a major controversy as to whether heroic mea-

sures to control aliasing and/or enforce strict energy conservation are helpful or merely

misleading.