lumped into a common framework as the “method of mean weighted residuals” (MWR).

If the φn(x) already satisfy the boundary conditions, MWR determines the spectral coefficients an by imposing the (N + 1) conditions

(wi, R[x; a0, a1, . . . , aN]) = 0,   i = 0, 1, . . . , N   (3.3)

for some suitable “test functions” wi(x), where the “inner product” is defined by

(u, v) ≡ ∫_a^b ω(x) u(x) v(x) dx   (3.4)

CHAPTER 3. GALERKIN & WEIGHTED RESIDUAL METHODS

for a given non-negative weight function ω(x) and any two functions u(x) and v(x).
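As a concrete numerical illustration (ours, not from the text), the inner product (3.4) with the Chebyshev weight ω(x) = 1/√(1 − x²) on [−1, 1] can be approximated by Gauss-Chebyshev quadrature; the function name `cheb_inner` is an assumption for this sketch:

```python
import numpy as np

def cheb_inner(u, v, n=32):
    """Inner product (u, v) with weight 1/sqrt(1 - x^2) on [-1, 1],
    computed by n-point Gauss-Chebyshev quadrature (exact for
    polynomial integrands of degree <= 2n - 1)."""
    k = np.arange(n)
    x = np.cos((2 * k + 1) * np.pi / (2 * n))  # quadrature nodes
    return (np.pi / n) * np.sum(u(x) * v(x))   # uniform weights pi/n

T2 = lambda x: 2 * x**2 - 1      # Chebyshev polynomial T_2
T3 = lambda x: 4 * x**3 - 3 * x  # Chebyshev polynomial T_3

print(cheb_inner(T2, T3))  # 0 to machine precision (orthogonality)
print(cheb_inner(T2, T2))  # pi/2 (the normalization constant)
```

The same routine works for any pair of smooth functions on [−1, 1]; only the weight function is hard-wired.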

Four choices of test functions are popular.

PSEUDOSPECTRAL alias COLLOCATION alias METHOD OF SELECTED POINTS:

wi(x) ≡ δ(x − xi)   (3.5)

where the xi are some suitable set of “interpolation” or “collocation” points [the two terms

are used interchangeably] and where δ(x) is the Dirac delta-function. Computationally,

one would impose the condition that

R(xi; a0, a1, . . . , aN) = 0,   i = 0, 1, . . . , N   (3.6)

for each collocation point.
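To make (3.6) concrete, here is a minimal collocation sketch on a toy problem of our own choosing (not from the text): approximate u′ + u = 0 with u(0) = 1 on [0, 1] by a quadratic u ≈ a0 + a1 x + a2 x², using one matrix row for the boundary condition and two rows demanding that the residual R = u′ + u vanish at two collocation points:

```python
import numpy as np

# For u = a0 + a1*x + a2*x^2, the residual of u' + u = 0 is
#   R(x) = (a1 + 2*a2*x) + (a0 + a1*x + a2*x**2)
x1, x2 = 1.0 / 3.0, 2.0 / 3.0   # collocation points (a simple choice)

A = np.array([
    [1.0, 0.0, 0.0],                  # u(0) = 1
    [1.0, 1.0 + x1, 2 * x1 + x1**2],  # R(x1) = 0
    [1.0, 1.0 + x2, 2 * x2 + x2**2],  # R(x2) = 0
])
b = np.array([1.0, 0.0, 0.0])
a = np.linalg.solve(A, b)

u1 = a @ np.array([1.0, 1.0, 1.0])    # u(1) = a0 + a1 + a2
print(u1, np.exp(-1))  # quadratic approximation vs. the exact e^{-1}
```

Even with only three degrees of freedom the endpoint value agrees with exp(−1) to a few percent; higher N with well-chosen points improves this rapidly.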

METHOD OF MOMENTS:

wi(x) ≡ x^i,   i = 0, 1, . . . , N   (3.7)

The reason for the name is that (x^i, f (x)) is said to be, in statistics, the “i-th moment” of

f (x).

This technique has been very popular in engineering. It works well if a small number of terms (say N = 2 or 3) is used. Unfortunately, the method of moments is almost invariably a disaster if N is large. The difficulty is that high powers of x become linearly dependent due to round-off error. For this reason, the powers of x are unsuitable as basis functions.

For example, fitting a polynomial to a function will fail when N is greater than 6 if the polynomial is expressed as powers of x. The matrix of this least-squares problem, the infamous “Hilbert matrix”, is a popular test for matrix software because it maximizes the effect of round-off! The method of moments generates matrices that are just as ill-conditioned.

The method of moments works well if (i) N is very small or (ii) calculations are performed in exact rational arithmetic using a symbolic manipulation language like Maple or Mathematica. Otherwise, the best way to avoid disastrous accumulation of roundoff error is to reject the method of moments. Instead, use Chebyshev or Legendre polynomials as the “test” and basis functions.

LEAST SQUARES:

If H is a linear operator, then

wi(x) ≡ Hφi(x)   (3.8)

The reason for the name is that one can show that this minimizes the “norm” of the residual where the norm is defined by

||R|| ≡ √(R, R)   (3.9)

In other words, “least squares” determines the coefficients a0, a1, . . . , aN in such a way that the inner product of R(x; a0, a1, . . . , aN) with itself is made as small as possible for the class of functions defined by an (N + 1)-term spectral series. The mathematical statement of this is

(R, R) ≤ (Hv − f, Hv − f)   for all v(x) of the form   (3.10)

v(x) = Σ_{n=0}^{N} dn φn(x)   for some {dn}   (3.11)

It is possible to make (R, R) smaller only by using more terms in the series, not by using different numerical values of the (N + 1) coefficients.

If the basis functions have the property of completeness, i.e., if we can approximate the exact solution u(x) to arbitrarily high accuracy by using a sufficiently large number of terms in the spectral series, then the “least squares” method is guaranteed to work.

One great advantage of “least squares” is that it always generates symmetric matrices even if the operator H lacks the property of “self-adjointness”. This is helpful in three ways. First, it greatly simplifies the theoretical numerical analysis. Second, symmetric matrix problems can be solved by a variety of special tricks which are faster and use less storage than the algorithms for non-symmetric matrices. Third, symmetric matrices are more stable in time-marching schemes.

It is possible to generalize the “least squares” method to nonlinear problems by defining

wi(x) ≡ ∂R/∂ai,   i = 0, 1, . . . , N   (3.12)

which includes (3.8) as a special case.
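The reduction of (3.12) to (3.8) is immediate when the residual has the linear form R = Hu − f (as assumed there), since a linear operator passes through the sum:

```latex
R(x;\, a_0,\dots,a_N) \;=\; H\!\left[\sum_{n=0}^{N} a_n \phi_n\right] - f
\;=\; \sum_{n=0}^{N} a_n\, H\phi_n \;-\; f
\qquad\Longrightarrow\qquad
\frac{\partial R}{\partial a_i} \;=\; H\phi_i .
```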

Despite its many virtues, the “least squares” method is relatively uncommon. For self-adjoint problems (and this includes all of quantum physics and chemistry, for example) the Galerkin method is simpler and has the same virtues: minimum norm of the residual and a symmetric matrix. Nonlinear problems are usually solved by collocation/pseudospectral methods because “least squares” gives a very unwieldy set of algebraic equations.

However, the least squares method is still the safest way of generating approximations

that depend nonlinearly on the unknowns.

GALERKIN METHOD:

wi(x) ≡ φi(x)   (3.13)

This choice of test function is successful because any well-chosen set of basis functions φn(x) will have all the properties desired of weighting functions, too, including linear independence and completeness. (We will take up the properties of basis sets in the next two sections.)

There are many workable choices of “test functions” wi (x) besides the four described

here. This book, however, will concentrate on the Galerkin method and the pseudospec-

tral/collocation algorithm.
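As a small self-contained sketch of the Galerkin recipe (our own toy problem, using a Fourier sine basis that already satisfies the boundary conditions): solve uxx = f on [0, π] with u(0) = u(π) = 0 and f = −4 sin 2x, whose exact solution is u = sin 2x. The Galerkin conditions (φn, uxx − f) = 0 with φn = sin nx decouple by orthogonality:

```python
import numpy as np

N = 6
x = np.linspace(0.0, np.pi, 2001)
f = -4.0 * np.sin(2.0 * x)   # forcing; the exact solution is sin(2x)

def inner(g, h):
    # (g, h) on [0, pi] with unit weight, composite trapezoid rule
    y = g * h
    dx = x[1] - x[0]
    return dx * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

a = np.zeros(N + 1)
for n in range(1, N + 1):
    phi = np.sin(n * x)
    # With u = sum a_m sin(mx): u_xx = -sum m^2 a_m sin(mx), and
    # (sin nx, sin mx) = (pi/2) delta_nm, so the Galerkin condition
    # (phi_n, u_xx - f) = 0 gives -n^2 a_n (pi/2) = (phi_n, f).
    a[n] = inner(phi, f) / (-n**2 * (np.pi / 2.0))

print(np.round(a, 6))  # only a_2 is (essentially) nonzero, a_2 ~ 1
```

Because the sine functions are eigenfunctions of d²/dx², the Galerkin matrix here is diagonal; for a general operator one would solve the full (N + 1)×(N + 1) system.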

One should also briefly note that terminology is not too well standardized. In this book, “collocation” is a synonym for “pseudospectral”. However, the former term is also used for a class of finite element methods, whereas “pseudospectral” is applied only when collocation is used with a basis of global (Fourier, Chebyshev, . . . ) rather than local (tent, pyramid, splines, . . . ) functions.

In addition, some authors such as Birkhoff and Lynch (1984) and Strang and Fix (1973) use “Rayleigh-Ritz” as a synonym for what we have called “Galerkin” and “Galerkin” as a synonym for “mean weighted residual”. Finlayson's terminology is preferable since “mean weighted residual” is descriptive in a way that a surname is not.

When the test functions differ from the basis functions, the label “Petrov-Galerkin” is

common.

In addition, some authors like Canuto et al. (1988) restrict “Galerkin” to basis sets that

individually satisfy the boundary conditions while using “tau-method” when φi (x) is a


Chebyshev polynomial. Our reasons for avoiding this usage are partly historical and partly pragmatic. The historical reason is given in Chapter 21. The pragmatic reason is that for some problems, such as those with differential equations which are singular at the endpoints, Chebyshev polynomials do satisfy the boundary conditions. Thus, a tau-method sometimes is a Galerkin method, too. We shall therefore avoid the label “tau-method” and use “Galerkin” to refer to any method which uses basis functions, either before or after modifications, as the “test” functions.

3.2 Completeness and Boundary Conditions

Useful sets of basis functions have a number of properties. First, they must be easy to

compute. Trigonometric functions and polynomials both certainly meet this criterion.

A second requirement is completeness. This means that the basis functions must be sufficient to represent all functions in the class we are interested in with arbitrarily high accuracy. A rigorous completeness proof is too complicated to discuss here. However, all the basis functions in this book (Fourier series, Chebyshev polynomials, Hermite functions, spherical harmonics and so on) do have the property of completeness.

When explicitly imposing boundary conditions, however, it is possible (indeed, very useful) to use a basis set which is incomplete for arbitrary functions. For instance, suppose the boundary conditions are homogeneous, such as:

u(−1) = u(1) = 0   (3.14)

We have two options: (i) impose the constraint explicitly by adding two rows

to the matrix problem we solve for the an :

Σ_{n=0}^{∞} an Tn(±1) = 0   (3.15)

or (ii) choose basis functions that independently satisfy the boundary conditions.

EXAMPLE: Chebyshev basis functions that vanish at the endpoints

The set of functions defined by

φ2n(x) ≡ T2n(x) − T0(x);   φ2n+1(x) ≡ T2n+1(x) − T1(x),   n ≥ 1   (3.16)

have the property

φn(±1) = 0   for all n   (3.17)

as can be proved by converting the Chebyshev polynomials in (3.16) into the corresponding

cosines and evaluating at θ = 0, π, which correspond to x = 1, −1.

This basis set omits two degrees of freedom since T0(x) and T1(x) are no longer independent, but instead appear only in combination with Chebyshev polynomials of higher degree. However, u(x) is not arbitrary either; it must satisfy the two boundary conditions. Therefore, the basis functions defined by (3.16) are complete for the most general function u(x) that satisfies the boundary conditions (3.14).
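The endpoint property (3.17) is easy to check numerically; a quick verification of ours using NumPy's Chebyshev module (the helper name `phi` is an assumption):

```python
import numpy as np
from numpy.polynomial import Chebyshev

def phi(k):
    """Recombined basis (3.16): T_k - T_0 for even k,
    T_k - T_1 for odd k (valid for k >= 2)."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    c[k % 2] = -1.0   # subtract T_0 (k even) or T_1 (k odd)
    return Chebyshev(c)

for k in range(2, 8):
    print(k, phi(k)(np.array([-1.0, 1.0])))  # ~0 at both endpoints
```

The cancellation works because Tk(1) = 1 and Tk(−1) = (−1)^k, so subtracting T0 (even k) or T1 (odd k) zeroes both endpoint values at once.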

Warning: a common programming error is to set up a test problem whose solution does not have homogeneous boundary conditions. If the basis vanishes at the endpoints and the target function u(x) does not, then the coefficients are meaningless, oscillatory, and rather large even in the N-th coefficient.

This device of taking linear combinations of Chebyshev polynomials (or something else) is extremely useful with homogeneous boundary conditions. We no longer have to complicate the programming by using two rows of our matrix equation for boundary conditions; now, all the rows of the matrix come from minimizing the residual R(x; an).

Another advantage of basis recombination, noted first by W. Heinrichs, is that the recombined basis functions are better conditioned than the Chebyshev polynomials themselves. That is to say, accumulated roundoff error is much lower with basis recombination,

especially when solving high order differential equations. A full explanation is deferred

until a later chapter. The short explanation is that high order derivatives of Chebyshev

(and Legendre) polynomials grow rapidly as the endpoints are approached. The mismatch

between the large values of the derivatives near x = ±1 and the small values near the

origin can lead to poorly conditioned matrices and accumulation of roundoff error. Basis

recombination greatly reduces this because the basis functions are tending rapidly to zero

near x = ±1, precisely where the polynomials (and their derivatives) are oscillating most

wildly. (However, the ill-conditioning of an uncombined basis is usually not a problem

unless N ∼ 100 or the differential equation has third or higher derivatives.)

3.3 Properties of Basis Functions: Inner Product & Orthogonality

Definition 12 (INNER PRODUCT) Let f(x) and g(x) be arbitrary functions. Then the inner product of f(x) with g(x) with respect to the weight function ω(x) on the interval [a, b] is defined by

(f, g) ≡ ∫_a^b f(x) g(x) ω(x) dx   (3.18)

Definition 13 (ORTHOGONALITY) A set of basis functions φn(x) is said to be ORTHOGONAL with respect to a given inner product if

(φm, φn) = δmn νn²   (3.19)

where δmn is the “Kronecker delta function” defined by

δmn = 1 if m = n;   δmn = 0 if m ≠ n   [Kronecker δ-function] (3.20)

and where the constants νn are called “normalization constants”.

The great virtue of orthogonality is that it gives a procedure for computing the spectral

coefficients. Let the expansion be

f(x) = Σ_{n=0}^{∞} an φn(x)   (3.21)

The inner product of both sides with the m-th basis function is

(f, φm) = Σ_{n=0}^{∞} an (φm, φn)   (3.22)

However, if (and only if) the basis functions are orthogonal, all of the inner products are

zero except the single term n = m. This proves the following.


Theorem 11 (INNER PRODUCT for SPECTRAL COEFFICIENTS) If a function f(x) is expanded as a series of orthogonal functions,

f(x) ≡ Σ_{n=0}^{N} an φn(x)   (3.23)

then for arbitrary N,

an = (φn, f)/(φn, φn)   ∀n   (3.24)
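A quick numerical sanity check of (3.24) (our illustration): expand f(x) = x² in Chebyshev polynomials with weight ω = 1/√(1 − x²). Since x² = (T0 + T2)/2, the formula should return a0 = a2 = 1/2 and all other coefficients zero:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

n_quad = 32
k = np.arange(n_quad)
x = np.cos((2 * k + 1) * np.pi / (2 * n_quad))  # Gauss-Chebyshev nodes
w = np.pi / n_quad                               # uniform quadrature weight

f = x**2
a = []
for n in range(5):
    Tn = C.chebval(x, [0] * n + [1])  # T_n evaluated at the nodes
    # a_n = (T_n, f) / (T_n, T_n), both inner products by quadrature
    a.append(w * np.sum(Tn * f) / (w * np.sum(Tn * Tn)))

print(np.round(a, 12))  # a_0 = a_2 = 0.5, the rest ~ 0
```

Note that no linear system is solved: orthogonality turns the coefficient computation into one inner product per mode, which is the whole point of the theorem.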

It is sometimes convenient to eliminate the “normalization constants” νn ≡ (φn, φn)^{1/2} by rescaling the basis functions, defining

φ̃n(x) ≡ φn(x)/νn   (3.25)

Definition 14 (ORTHONORMAL) A basis set is said to be ORTHONORMAL if the basis func-