
$$
\frac{df}{dx}(x_i) \;=\; \phi_{N,x}(x_i)\,\sum_{j=1,\; j\neq i}^{N} \frac{f_j/\phi_{N,x}(x_j)}{x_i - x_j}
\qquad (10.28)
$$

since the second sum is zero except for contributing a singular term which cancels the
i = j term in the first sum. (To avoid unpleasant cancellation of singularities, one could
have noted that since the i-th cardinal function has a maximum at x = x_i, its derivative
is zero at that point and one can therefore omit the i-th term in the sum in (10.27) before
performing the differentiation.) Thus, differentiation is a sum of multipole form, too. The
common factor of φ_{N,x}(x_i) is computed and stored during the initialization of the program.

Figure 10.4: The grid point contributions on a typical interval, x ∈ [x_{m1}, x_{m2}], can be
replaced by one or a few terms of a multipole expansion when the target point x is sufficiently
far away, such as any of the grid points (short vertical lines) pointed to by the black arrows.

The multipole idea, defining the "charges" q_j ≡ f_j/φ_{N,x}(x_j), is to replace a portion of
the sum where j ∈ [m_1, m_2] by a geometric series in inverse powers of (x − x_M), where
M = (m_1 + m_2)/2. It is implicitly assumed that this particular series will be used only for
x = x_i which are far from the interval [x_{m1}, x_{m2}] (Fig. 10.4). For different ranges of x_i, one
must use different multipole series, which makes the bookkeeping a bit tricky. Nevertheless,
the rewards are great when N is very large. Without approximation,
$$
\sum_{j=m_1}^{m_2} \frac{q_j}{x - x_j}
\;=\; \frac{1}{x - x_M}\sum_{j=m_1}^{m_2} q_j\left[\,1 + \frac{x_M - x_j}{x - x_M}\,\right]^{-1}
\;=\; \sum_{k=0}^{\infty} \mu_k \left(\frac{1}{x - x_M}\right)^{k+1}
\qquad (10.29)
$$

where the “multipole moments” are
$$
\mu_k \;=\; \sum_{j=m_1}^{m_2} q_j\,(x_M - x_j)^{k}
\qquad (10.30)
$$

In practice, the infinite series of multipoles (which is exact) must be truncated to just a
handful of multipoles, which requires some careful thought about error tolerances. One
drawback of FMM is that the cost increases roughly linearly with the number of correct
digits.
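
As a concrete illustration, here is a minimal sketch in Python (not the language of the original discussion); the grid, the "charges" q_j and the block indices m1, m2 are invented for the example. It builds the moments of Eq. (10.30) for one block, with the sign convention chosen so that the alternating signs of the geometric series in Eq. (10.29) are absorbed into the moments, and compares the truncated expansion against the direct block sum at a distant target point:

```python
import numpy as np

# Invented example: "charges" q_j = f_j / phi_{N,x}(x_j) at grid points x_j.
N = 64
xj = np.linspace(0.0, 1.0, N)
qj = np.random.default_rng(0).standard_normal(N)

# One block of the sum, j in [m1, m2], expanded about the block center x_M.
m1, m2 = 10, 19
xM = 0.5 * (xj[m1] + xj[m2])

# Multipole moments; using (x_j - x_M)^k rather than (x_M - x_j)^k absorbs
# the alternating signs of the geometric series in Eq. (10.29).
K = 8                                          # number of multipoles retained
mu = np.array([np.sum(qj[m1:m2 + 1] * (xj[m1:m2 + 1] - xM) ** k)
               for k in range(K)])

def block_direct(x):
    """Exact block sum: O(m2 - m1) operations per target point."""
    return np.sum(qj[m1:m2 + 1] / (x - xj[m1:m2 + 1]))

def block_multipole(x):
    """Truncated multipole expansion: O(K) operations per target point."""
    return np.sum(mu / (x - xM) ** np.arange(1, K + 1))

# The expansion is used only for targets far from the block [x_{m1}, x_{m2}].
x_far = 0.9
print(block_direct(x_far), block_multipole(x_far))   # agree to several digits
```

The savings appear only when the same moments μ_k are reused for many target points x_i, all of which are far from the block [x_{m1}, x_{m2}].
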
A number of variations on this "divide-and-conquer" strategy have been devised. Alpert
and Rokhlin (1991) have developed a fast (i.e., O(N log2(N))) algorithm for converting a
Chebyshev series to a Legendre series or vice versa. The heart of the algorithm is that the
elements of the transformation matrix vary smoothly with row and column indices, which
allows one to replace blocks of the usual matrix-vector multiply by expanding the matrix
elements as low-order polynomials in the indices and then reversing the order of summation.
Although their scheme does not directly compute derivatives, one can evaluate
derivatives using the regular FFT on the Chebyshev series, so their algorithm allows fast
solutions to differential equations using a Legendre basis. Unfortunately, their algorithm
does not generalize to Associated Legendre functions because, except for the special case
of Legendre polynomials, the Associated Legendre-to-Chebyshev matrix is not sufficiently
smooth.
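
The following sketch (Python; the smooth "matrix" and every parameter below are invented, and this is only a cartoon of the general principle rather than Alpert and Rokhlin's actual algorithm) shows how a matrix-vector product collapses to O(NP) work per vector once the entries are fitted by low-order polynomials of the indices and the order of summation is reversed:

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander

# Invented smooth "transformation matrix": the entries are a smooth function of
# the (scaled) row index s_i and column index t_j.  This is only a cartoon of
# the principle, not the actual Chebyshev-Legendre connection coefficients.
N = 256
s = 2.0 * np.arange(N) / (N - 1) - 1.0        # row index mapped to [-1, 1]
t = s.copy()                                  # column index mapped to [-1, 1]
A = np.exp(-(s[:, None] - t[None, :]) ** 2)   # smooth in both indices

x = np.random.default_rng(1).standard_normal(N)
y_exact = A @ x                               # direct mat-vec: O(N^2)

# One-time setup: fit A[i,j] ~ sum_{p,q} C[p,q] T_p(s_i) T_q(t_j) by least
# squares (a real algorithm would obtain such coefficients far more cheaply).
P = 12
Ts = chebvander(s, P - 1)                     # N x P
Tt = chebvander(t, P - 1)                     # N x P
C = np.linalg.pinv(Ts) @ A @ np.linalg.pinv(Tt).T   # P x P

# Fast apply: reverse the order of summation so the O(N) sums over j are done
# only P times.  Cost per vector: O(N*P) instead of O(N^2).
y_fast = Ts @ (C @ (Tt.T @ x))
print(np.max(np.abs(y_fast - y_exact)) / np.max(np.abs(y_exact)))  # small relative error
```
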
However, a variety of other schemes have been invented to transform spherical harmonics,
including those of Dilts (1985), Brown (1985), Jakob (1993) and Jakob-Chien & Alpert (1997)
(Table 10.4). Orszag (1986) devised an ingenious algorithm which can be applied to almost
any series of functions which satisfy three-term recurrence relations. This includes all orthogonal
polynomials plus some types of Bessel function series. Driscoll, Healy and Rockmore
(1997) have developed a fast transform that applies to any set of orthogonal polynomials.
Unfortunately, the FMM/Fast Generalized Transform revolution has been long on ingenuity
and short on applications to spectral methods. The reason is that although these
algorithms have costs which are O(N log2(N)), the proportionality constant, which is 5/2
for a Chebyshev FFT, is very large for these algorithms. When N is 10,000, as in stellar
simulations, these algorithms generate vast savings. For the N = 213 resolution of 1996
spherical harmonics forecasting models, the savings relative to the Matrix Multiplication
Transform are so small, and perhaps nonexistent, that the FMM and its brethren have
not been used. As N rises in hydrodynamics models, however, such algorithms will become
increasingly attractive as alternatives to the MMT for basis sets like spherical harmonics
for which no FFT is known.



10.7 Transforms on an Irregular, Nonequispaced Grid: "Off-grid" Interpolation
In Semi-Lagrangian time-marching schemes (Staniforth and Côté, 1991, and Ritchie, 1987,
1988, 1991), one must interpolate velocities and other fields to the departure points of
trajectories, which are spaced irregularly with respect to the fixed, evenly spaced Fourier
grid (or the uneven Chebyshev grid). Because the spectral series must be summed at N
unevenly spaced, irregularly distributed points, the Fast Fourier Transform cannot be used
even with a Fourier or Chebyshev basis. Worse still, this "off-grid" interpolation must be
performed every time step.
The naive strategy is to simply sum the spectral series term-by-term, but this costs
O(N²) operations versus the O(N log2(N)) cost of an FFT. However, there is an added
cost: because the grid points shift from time step to time step, the sines and cosines or
Chebyshev polynomials cannot be precomputed during program initialization, but must
be computed afresh at each step. This means that the cost of the MMT is O(N²) with a large
proportionality constant instead of the usual factor of two. Since all other steps in a Chebyshev
or Fourier semi-Lagrangian code can normally be performed by FFT and other fast
procedures, the "off-grid" interpolation is the rate-determining step, the most expensive
step, in the entire algorithm.
Fortunately, five strategies for fast off-grid interpolation are known. The first is to apply
the FMM; first published by Boyd (1992c), the FMM/off-grid procedures have been systematically
developed by Dutt and Rokhlin (1993, 1994). The second strategy is a Chebyshev
polynomial expansion procedure invented by Süli and Ware (1991, 1992). Anderson
and Dahleh (1996) devised a similar algorithm with Taylor series instead of Chebyshev
series; they are unique in also providing an efficient algorithm for the reverse off-grid
transform (from grid point values at a set of irregularly spaced points to Fourier
coefficients). Moore, Healy and Rockmore (1993) invented a fast O(N log2(N)) algorithm for
discrete Fourier transforms on nonuniform grids. Ware (1998) is a good review with some
original comparisons between different algorithms.




Figure 10.5: Boyd's off-grid interpolation scheme, illustrated through a Fourier basis. The
first step is a standard FFT (arrow) from an evenly spaced coarse grid with N points to
a finer grid (bottom). The second step is low order polynomial interpolation to obtain an
approximation at the off-grid point (marked by the large "X"), using a small number of
grid points centered on the target (heavy circles).



The fourth strategy is a two-part procedure: (i) interpolate to a fine grid, typically with
3N points, from the N-point canonical grid and (ii) apply low order polynomial interpolation
or Euler summation of the cardinal series on the fine grid. It is clear that polynomial
interpolation, using the 2M + 1 points nearest the target point, is relatively inexpensive
with a cost of O(M²N) operations at most. (We say "at most" because for large M, the
usual Lagrangian formulas can be replaced by more sophisticated algorithms with some
saving.) However, if polynomial interpolation is performed on the original grid, much
accuracy is lost: the precision of a hundred-point approximation to derivatives is replaced by
perhaps a seven-point approximation to velocities at the departure points.

On the fine grid, however, the shortest waves in the function being interpolated have a
wavelength equal to 6h_fine, assuming h_fine = h/3. Boyd shows that each increase in M
by one will reduce the error in interpolating the 6h_fine waves by a factor of four; for other
wavelengths, the ratio is 1/sin²(π h_fine/Wavelength), which is even larger. By choosing the
proper combination of M and the ratio h_fine/h, one can recover spectral accuracy at a
cost which is proportional to O(N log2(N)). The cost of the interpolation to the fine grid of
3N points is three times the cost of an FFT on N points, but this still gives a relatively small
proportionality constant. The programming is also much easier than the Fast Multipole
Method; Lagrangian interpolation and the FFT are the key building blocks, and both are
available as FORTRAN library software or as built-in commands in MATLAB.
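
A minimal sketch of this fourth strategy in Python for a Fourier basis (the sample function, N, the padding factor and M below are arbitrary choices; NumPy's FFT stands in for the library FORTRAN routines):

```python
import numpy as np

N, pad, M = 64, 3, 3
x = 2 * np.pi * np.arange(N) / N
f = np.exp(np.sin(x))                         # sample smooth periodic function

# Step (i): spectral zero-padding -> values on a fine grid of pad*N points.
F = np.fft.fft(f)
F_fine = np.zeros(pad * N, dtype=complex)
F_fine[:N // 2] = F[:N // 2]                  # non-negative wavenumbers
F_fine[-N // 2:] = F[-N // 2:]                # negative wavenumbers
f_fine = pad * np.real(np.fft.ifft(F_fine))
h_fine = 2 * np.pi / (pad * N)

# Step (ii): Lagrange interpolation of order 2M+1 on the fine grid.
def offgrid(xt):
    jc = int(round(xt / h_fine))              # nearest fine-grid index
    js = np.arange(jc - M, jc + M + 1)        # 2M+1 neighbouring indices
    xs = js * h_fine
    fs = f_fine[js % (pad * N)]               # wrap periodically at the ends
    val = 0.0
    for a in range(2 * M + 1):
        w = np.prod(np.delete(xt - xs, a)) / np.prod(np.delete(xs[a] - xs, a))
        val += w * fs[a]
    return val

xt = 1.2345                                   # an "off-grid" departure point
print(offgrid(xt), np.exp(np.sin(xt)))        # the two values agree closely
```

Raising M or the padding factor trades a little extra work for more correct digits, as described above.
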

10.8 Fast Fourier Transform: Practical Matters
The first point is: It is never necessary to write an FFT routine yourself. Not only are FFTs
available in most software libraries, such as the NAG and IMSL libraries, and as a built-in
command in MATLAB, but also many books give FORTRAN or C codes (Canuto et al.,
1988, Appendix B; Fornberg, 1996, Appendix F; Press et al., 1986). All three books give
special FFTs for (i) real data, (ii) cosine transforms and (iii) sine transforms.
The second point is that unfortunately there is no universally accepted convention for
the FFT. Some software defines it as the complex conjugate of what other routines compute;
others begin the summation with zero instead of one, and thus multiply the transform by
exp(2πi/N), etc. It is very important to carefully read the documentation (or comment
statements) in your chosen library software to understand what it is computing.
For example, most FFTs sum over wavenumbers from k = 1 to N (as in our Eq. (10.19)
above) instead of k = −N/2 to k = N/2 − 1, which is the usual convention in solving
differential equations. The sum over positive wavenumbers is an "aliased" form in the
sense that k = (3/4)N and k = −N/4 have identical grid point values on a grid of only N
points, and similarly for all wavenumbers k > N/2. To correctly compute derivatives, one
must multiply the coefficient of a wavenumber k > N/2 not by ik but by i times its "alias",
the equivalent wavenumber on k ∈ [−N/2, N/2].
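
For instance, with NumPy's FFT conventions (used here only as one concrete example; other libraries order and normalize the coefficients differently, which is exactly the point), the derivative bookkeeping looks like this:

```python
import numpy as np

N = 32
x = 2 * np.pi * np.arange(N) / N
f = np.exp(np.sin(x))

F = np.fft.fft(f)                   # coefficients ordered as k = 0, 1, ..., N-1
k = np.arange(N, dtype=float)
k[N // 2 + 1:] -= N                 # map each k > N/2 to its alias k - N
k[N // 2] = 0.0                     # zero the Nyquist mode for an odd derivative
# (np.fft.fftfreq(N, d=1.0/N) gives the same signed wavenumbers,
#  with the Nyquist mode returned as -N/2.)

df = np.real(np.fft.ifft(1j * k * F))
print(np.max(np.abs(df - np.cos(x) * f)))     # error near machine precision
```
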
There are many formulas for extracting an optimal cosine transform from a general,
complex FFT if a cosine transform is not available. Details are given in the books by Canuto
et al., Fornberg, and Briggs and Henson (1995).


10.9 Summary
• For large N and a basis of Fourier functions or Chebyshev polynomials or rational
Chebyshev functions, the most efficient way to pass from grid points to spectral
coefficients or vice versa is through the Fast Fourier Transform (FFT).
• If the solution has symmetries, such as periodic data which can be represented by a
cosine series instead of a full Fourier series, these symmetries can be exploited by the
FFT.
• The cost of the FFT is approximately (5/2)N log2(N) for all transforms of real-valued
data including cosine, sine, odd cosine, odd sine and Chebyshev transforms, if N is
interpreted as the appropriate fraction of points on the periodicity interval which are
actually transformed.
• The FFT, usually in several forms including separate, optimized subroutines for the
sine and cosine transform, is available in most software libraries and is also given
in Numerical Recipes (Press et al., 1986, 1992) and in Appendix B of Canuto et al. and
Appendix F of Fornberg (1996).
• Both the grid-point-to-spectral coefficient transform (interpolation) and the spectral-
to-grid transform (summation of spectral series) can be performed by a square matrix/vector
multiply at a cost of 2N² operations for real-valued data; this is the Matrix
Multiplication Transform (MMT).
• For small to moderate N, the Matrix Multiplication Transformation is more efficient
than the FFT for Chebyshev and Fourier calculations.
• The “cross-over” point where the FFT becomes faster than MMT is highly dependent
on both hardware and software and may range from N = 8 to N = 512.

• When many one-dimensional transforms are computed simultaneously, the cost of
the FFT can be greatly reduced by banishing the indexing to the outermost loops.

• The FFT is restricted to the Fourier basis, Chebyshev polynomials, and basis functions
obtained from these by a change of coordinates.
• The MMT can be applied to any orthogonal polynomial or Fourier basis including
Hermite functions, Laguerre functions and others for which the FFT is inapplicable.

• For basis sets whose elements have definite parity, the MMT can be split into two
transforms of size N/2 to obtain the Parity Matrix Multiplication Transformation
(PMMT), which costs only half as much as the MMT.

• Generalized FFTs, many based on the Fast Multipole Method (FMM), have been de-
veloped for most non-FFT basis sets.

• The generalized FFTs, although faster than MMT or PMMT in the asymptotic limit
N → ∞, have such large proportionality constants p in cost estimates of the form
O(pN log2(N)) that these algorithms have only rarely been used in pseudospectral
calculations.

• Fast algorithms for interpolating to a set of N points which are spaced irregularly
with respect to the standard pseudospectral grid have been developed; these have
a cost of O(N log2(N)) like the FFT, but with a proportionality constant three or
four times larger than for the FFT. (Needed for semi-Lagrangian time-marching
algorithms, among others.)
• Definitions of the FFT vary; read your documentation carefully and adjust the output
of the library software to the transform you want to compute.
Chapter 11

Aliasing, Spectral Blocking, &
Blow-Up


“Blowup of an aliased, non-energy-conserving model is God's way of protecting you from
believing a bad simulation.”
– J. P. Boyd


“If a problem ... for which the solution develops a singularity in finite time ... is simulated
with a fixed resolution, then a dealiasing procedure is advisable. Moreover, if a fully
turbulent flow is simulated with marginal resolution, then dealiasing may also be useful.”
– Canuto et al. (1988, pg. 123)


11.1 Introduction
On a grid of N points, Fourier components exp(ikx) with |k| > N/2 appear as low wavenumbers;
the high wavenumber is said to be “aliased” to the low. Phillips (1959) attributed
the blowup of his model of the earth's general circulation (1956) to an instability specifically
caused by aliasing. Because of his work, many modern codes employ a “dealiasing”
procedure or special numerical algorithms that conserve the discretized energy so that a
runaway to supersonic winds is impossible.
There are precursors to blow-up. One is the appearance of “2h” waves in physical
space, that is, waves whose wavelength is twice the grid spacing h. This is accompanied
by a tendency for the Fourier coefficients to curl up near k = π/h, the “aliasing limit”; this
growth with k, instead of the expected decay, is called “spectral blocking”.
In this chapter, we review all these matters. Although “aliasing instability” and dealias-
ing algorithms are well-known, there is still a major controversy as to whether heroic mea-
sures to control aliasing and/or enforce strict energy conservation are helpful or merely
misleading.
