[158, 241, 169, 106, 107, 285].

To replace the lengthy, jaw-breaking phrase “optimally-truncated asymptotic series”, Berry and Howls coined a neologism [35, 30] which is rapidly gaining popularity: “superasymptotic”. A more compelling reason for new jargon is that the standard definition of asymptoticity (Def. 1 above) is a statement about powers of ε, but the error in an optimally-truncated divergent series is usually an exponential function of the reciprocal of ε.

ActaApplFINAL_OP92.tex; 21/08/2000; 16:16; no v.; p.10

11

Exponential Asymptotics

Definition 3 (Superasymptotic). An optimally-truncated asymptotic series is a “superasymptotic” approximation. The error is typically O(exp(−q/ε)) where q > 0 is a constant and ε is the small parameter of the asymptotic series. The degree N of the highest term retained in the optimal truncation is proportional to 1/ε.

Fig. 1 illustrates the errors in the asymptotic series for the Stieltjes function (defined in the next section) as a function of N for fifteen different values of ε. For each ε, the error dips to a minimum at N ≈ 1/ε as the perturbation order N increases. The minimum error for each ε is the “superasymptotic” error.

Also shown is the theoretical prediction that the minimum error for a given ε is (π/(2ε))^(1/2) exp(−1/ε) where N_optimum(ε) ∼ 1/ε − 1. For this example, both the exponential factor and the proportionality constant will be derived in Sec. 5.
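The dip in Fig. 1 can be reproduced numerically. The sketch below is an illustration, not code from the paper: it uses the series coefficients (−1)^j j! derived for the Stieltjes function in the next sections, and the quadrature cutoff T = 60 and grid size are arbitrary choices adequate for ε = 1/10.

```python
import math

def stieltjes(eps, T=60.0, n=60000):
    """Composite Simpson quadrature of exp(-t)/(1+eps*t) on [0, T].

    The integrand decays like exp(-t), so the tail beyond T = 60 is
    negligible (~1e-27) for the eps used here.  n must be even.
    """
    f = lambda t: math.exp(-t) / (1.0 + eps * t)
    h = T / n
    s = f(0.0) + f(T)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * f(i * h)
    return s * h / 3.0

def partial_sum(eps, N):
    """Partial sum of the divergent series S(eps) ~ sum_j (-1)^j j! eps^j."""
    total, term = 0.0, 1.0
    for j in range(N + 1):
        total += term
        term *= -(j + 1) * eps  # next term: (-1)^(j+1) (j+1)! eps^(j+1)
    return total

eps = 0.1
exact = stieltjes(eps)
errors = [abs(partial_sum(eps, N) - exact) for N in range(21)]
N_best = min(range(21), key=errors.__getitem__)
print(N_best, errors[N_best])  # error is smallest near N ~ 1/eps - 1 = 9
```

For ε = 1/10 the error bottoms out near N ≈ 9 at roughly the level predicted by (π/(2ε))^(1/2) exp(−1/ε) ≈ 1.8 × 10^(−4), then grows again as the divergent tail takes over.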

The definition of “superasymptotic” makes a claim about the exponential dependence of the error which is easily falsified. Merely by redefining the perturbation parameter, we could, for example, make the minimum error be proportional to the exponential of 1/ε^γ where γ is arbitrary. Modulo such trivial rescalings, however, the superasymptotic error is indeed exponential in 1/ε for a wide range of divergent series [30, 72].

The emerging art of “exponential asymptotics” or “beyond-all-orders” perturbation theory has made it possible to improve upon optimal truncation of an asymptotic series, and calculate quantities “below the radar screen”, so to speak, of the superasymptotic approximation. It will not do to describe these algorithms as the calculation of exponentially small quantities since the superasymptotic approximation, too, has an accuracy which is O(exp(−q/ε)) for some constant q. Consequently, Berry and Howls coined another term to label schemes that are better than mere truncation of a power series in ε:

Definition 4. A hyperasymptotic approximation is one that achieves higher accuracy than a superasymptotic approximation by adding one or more terms of a second asymptotic series, with different scaling assumptions, to the optimal truncation of the original asymptotic expansion [30]. (With another rescaling, this process can be iterated by adding terms of a third asymptotic series, and so on.)

All of the methods described below are “hyperasymptotic” in this sense, although in the process of understanding them, we shall acquire a deeper understanding of the mathematical crimes and genius that underlie asymptotic expansions and the superasymptotic approximation.


12 John P. Boyd

[Figure 1 appears here: log-scale plot of the errors (vertical axis, 10^0 down to 10^(−6)) versus perturbation order N (0 to 20, lower axis) and 1/ε (1 to 21, upper axis), with one curve for each of ε = 1, 1/2, 1/3, …, 1/15.]

Figure 1. Solid curves: absolute error in the approximation of the Stieltjes function up to and including the N-th term. Dashed-and-circles: theoretical error in the optimally-truncated or “superasymptotic” approximation: E_{N_optimum}(ε) ≈ (π/(2ε))^(1/2) exp(−1/ε) versus 1/ε. The horizontal axis is perturbative order N for the actual errors and 1/ε for the theoretical error.

But when does a series diverge? Since all derivatives of exp(−1/ε) vanish at the origin, this function has only the trivial and useless power series expansion whose coefficients are all zeros:

exp(−q/ε) ∼ 0 + 0 ε + 0 ε^2 + …    (2)

for any positive constant q. This observation implies the first of our four heuristics about the non-convergence of an ε power series.

Proposition 2 (Exponential Reciprocal Rule). If a function f(ε) contains a term which is an exponential function of the reciprocal of ε, then a power series in ε will not converge to f(ε).
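The observation behind this rule is easy to check numerically (an illustration of mine, not the paper's): exp(−1/ε) shrinks faster than any fixed power of ε as ε → 0, which is exactly why every Taylor coefficient about ε = 0 vanishes. The sample values of k and ε below are arbitrary.

```python
import math

# exp(-1/eps) lies "beyond all orders" in eps: for each fixed power k,
# exp(-1/eps)/eps**k -> 0 as eps -> 0 (monotonically once eps < 1/k),
# so no term c_k * eps**k of a power series can capture it.
for k in (1, 5, 10):
    ratios = [math.exp(-1.0 / eps) / eps**k for eps in (0.05, 0.02, 0.01)]
    assert ratios[0] > ratios[1] > ratios[2], (k, ratios)
print("exp(-1/eps) vanishes faster than any power of eps")
```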

We must use the phrase “not converge to” rather than the stronger “diverge” because of the possibility of a function like


h(ε) ≡ √(1 + ε) + exp(−1/ε)    (3)

The power series of h(ε) will converge for all |ε| < 1, but it converges to a number different from the true value of h(ε) for all ε except ε = 0.

Fortunately, this situation (a convergent series for a function that contains a term exponentially small in 1/ε, and therefore invisible to the power series) seems to be rare in applications. (The author would be interested in learning of exceptions.)
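The gap between the convergent series and the true h(ε) can be exhibited directly. A short check (mine, not the paper's; the 40-term truncation and ε = 0.3 are arbitrary illustrative choices):

```python
import math

def sqrt_series(eps, terms=40):
    """Taylor (binomial) series of sqrt(1+eps) about eps = 0."""
    s, c = 0.0, 1.0
    for k in range(terms):
        s += c * eps**k
        c *= (0.5 - k) / (k + 1)  # next binomial coefficient C(1/2, k+1)
    return s

eps = 0.3
h = math.sqrt(1.0 + eps) + math.exp(-1.0 / eps)
gap = h - sqrt_series(eps)
# The convergent series recovers sqrt(1+eps) to machine precision but
# misses the exponentially small term entirely:
assert abs(gap - math.exp(-1.0 / eps)) < 1e-12
```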

Milton van Dyke, a fluid dynamicist, offered another useful heuristic in his slim book on perturbation methods [297]:

Proposition 3 (Principle of Multiple Scales). Divergence should

be expected when the solution depends on two independent length

scales.

We shall illustrate this rule later.

The physicist Freeman Dyson [122] published a note which has been widely invoked in both quantum field theory and quantum mechanics for more than forty years [164, 165, 166], [44, 45, 43]. However, with appropriate changes of jargon, the argument applies outside the realm of the quantum, too. Terminological note: a “bound state” is a spatially localized eigenfunction associated with a discrete, negative eigenvalue of the stationary Schrödinger equation and the “coupling constant” ε is the perturbation parameter which multiplies the potential energy perturbation.

Proposition 4 (Dyson Change-of-Sign Argument). If there are no bound states for negative values of the coupling constant ε, then a perturbation series for the bound states will diverge even for ε > 0.

A simple example is the one-dimensional anharmonic quantum oscillator, whose bound states are the eigenfunctions of the stationary Schrödinger equation:

ψ_xx + {E − x^2 − ε x^4} ψ = 0    (4)

When ε ≥ 0, Eq. (4) has a countable infinity of bound states with positive eigenvalues E (the energy); each eigenfunction decays exponentially with increasing |x|. However, the quartic perturbation will grow faster with |x| than the unperturbed potential energy term, which is quadratic in x. It follows that when ε is negative, the perturbation will reverse the sign of the potential energy at x = ±1/(−ε)^(1/2). Because of this, the wave equation has no bound states for ε < 0, that is, no eigenfunctions which decay exponentially with |x| for all sufficiently large |x|.
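The turning point where the perturbation overwhelms the unperturbed potential is easy to exhibit; a toy check (not from the paper, with an arbitrary coupling ε = −0.01):

```python
# Sign reversal of the potential V(x) = x**2 + eps*x**4 for eps < 0:
# V becomes negative beyond |x| = 1/sqrt(-eps), destroying bound states.
eps = -0.01                      # arbitrary small negative coupling
x_turn = 1.0 / (-eps) ** 0.5     # here x_turn = 10.0
V = lambda x: x**2 + eps * x**4
assert V(0.9 * x_turn) > 0.0     # inside: potential still confining
assert V(1.1 * x_turn) < 0.0     # outside: potential unbounded below
```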

Consequently, the perturbation series cannot converge to a bound state for negative ε, be it ever so small in magnitude, because there is no bound state to converge to. If this non-convergence is divergence (as opposed to convergence to an unphysical answer), then the divergence must occur for all non-zero positive ε, too, since the domain of convergence of a power series is always |ε| < ρ for some positive ρ, as reviewed in elementary calculus texts.

This argument is not completely rigorous because the perturbation series could in principle converge for negative ε to something other than a bound state. Nevertheless, the Change-of-Sign Argument has been reliable in quantum mechanics [164].

Implicit in the very notion of a “small perturbation” is the idea that the term proportional to ε is indeed small compared to the rest of the equation. For the anharmonic oscillator, however, this assumption always breaks down for |x| > 1/|ε|^(1/2). Similarly, in high Reynolds number fluid flows, the viscosity is a small perturbation everywhere except in thin layers next to boundaries, where it brings the velocity to zero (“no slip” boundary condition) at the wall. This and other examples suggest our fourth heuristic:

Proposition 5 (Principle of Non-Uniform Smallness). Divergence should be expected when the perturbation is not small, even for arbitrarily small ε, in some regions of space.

When the perturbation is not small anywhere, of course, it is impossible to apply perturbation theory. When the perturbation is small uniformly in space, the power series usually has a finite radius of convergence. Asymptotic-but-divergent is the usual spoor of a problem where the perturbation is small-but-not-everywhere.

We warn that these heuristics are just that, and not theorems. Counterexamples to some are known, and probably can be constructed for all. In practice, though, these empirical predictors of divergence are quite useful.

Pure mathematics is the art of the provable, but applied mathematics is the description of what happens. These heuristics illustrate the gulf between these realms. The domain of a theorem is bounded by extremes, even if unlikely. Heuristics are descriptions of what is probable, not the full range of what is possible.

For example, the simplex method of linear programming can converge very slowly because (it can be proven) the algorithm could visit every one of the millions and millions of vertices that bound the feasible region for a large problem. The reason that Dantzig's algorithm has been widely used for half a century is that in practice, the simplex method finds an acceptable solution after visiting only a tiny fraction of the vertices.

Similarly, Hotelling proved in 1944 that (in the worst case) the round-off error in Gaussian elimination could be 4^N times machine epsilon where N is the size of the matrix, implying that a matrix of dimension larger than 50 is insoluble on a machine with sixteen decimal places of precision. What happens in practice is that the matrices generated by applications can usually be solved even when N > 1000 [294]. The exceptions arise mostly because the underlying problem is genuinely singular, and not because of the perversities of roundoff error.

In a similar spirit, we o¬er not theorems but experience.

4. Optimal Truncation and Superasymptotics for the

Stieltjes Function

The first illustration is the Stieltjes function, which, with a change of variable, is the “exponential integral” which is important in radiative transfer and other branches of science and engineering. This integral-depending-on-a-parameter is defined by

S(ε) = ∫_0^∞ exp(−t)/(1 + εt) dt    (5)

The geometric series identity, valid for arbitrary integer N,

1/(1 + εt) = Σ_{j=0}^{N} (−εt)^j + (−εt)^{N+1}/(1 + εt)    (6)

allows an exact alternative definition of the Stieltjes function, valid for any finite N:
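Identity (6) is exact for every finite N, not merely asymptotic. A quick spot-check (my illustration; the (ε, t, N) triples are arbitrary, and one deliberately has εt > 1, where the infinite geometric series itself would diverge):

```python
def lhs(eps, t):
    return 1.0 / (1.0 + eps * t)

def rhs(eps, t, N):
    # Partial geometric sum plus the exact remainder term of Eq. (6)
    partial = sum((-eps * t) ** j for j in range(N + 1))
    return partial + (-eps * t) ** (N + 1) / (1.0 + eps * t)

# The identity holds for ANY finite N, even when |eps*t| > 1:
for eps, t, N in [(0.1, 2.0, 3), (0.5, 1.5, 7), (2.0, 3.0, 4)]:
    assert abs(lhs(eps, t) - rhs(eps, t, N)) < 1e-9
```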

N ∞

j