determinants (too many arithmetical operations are needed, and rounding errors on a computer often lead to a completely wrong result). A recursive procedure for computing the e_k(S_n)'s without computing the determinants involved in their definition was needed. Such an algorithm was obtained in 1956 by Peter Wynn. It is called the ε-algorithm [147]. It is as follows. One starts with
ε_{-1}^{(n)} = 0,   ε_0^{(n)} = S_n

and then

ε_{k+1}^{(n)} = ε_{k-1}^{(n+1)} + 1 / (ε_k^{(n+1)} − ε_k^{(n)}).
Note that the ε_k^{(n)}'s fill a two-dimensional array. The ε-algorithm is related to Shanks' transformation by

ε_{2k}^{(n)} = e_k(S_n)   and   ε_{2k+1}^{(n)} = 1/e_k(ΔS_n).

Thus, the ε's with an odd lower index are only auxiliary quantities. They can be eliminated from the algorithm, thus leading to the so-called cross rule due to Wynn [153].
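The recursion above is short enough to sketch in a few lines of code. The following is a minimal illustration, not from Wynn's paper; the function name and the test series (the partial sums of ln 2 = 1 − 1/2 + 1/3 − · · ·) are my own choices:

```python
import math

def epsilon_algorithm(S):
    """Sketch of Wynn's epsilon-algorithm: build the table column by column
    and return the last entry of the highest even-index column."""
    prev2 = [0.0] * len(S)   # column eps_{-1}^{(n)} = 0
    prev = list(S)           # column eps_0^{(n)}  = S_n
    best = prev[-1]
    for k in range(len(S) - 1):
        # eps_{k+1}^{(n)} = eps_{k-1}^{(n+1)} + 1/(eps_k^{(n+1)} - eps_k^{(n)})
        cur = [prev2[n + 1] + 1.0 / (prev[n + 1] - prev[n])  # may hit a zero divisor
               for n in range(len(prev) - 1)]
        prev2, prev = prev, cur
        if k % 2 == 1:       # cur is now an even-index column, an e_j(S_n) column
            best = cur[-1]
    return best

# partial sums of the alternating series for ln 2
S, s = [], 0.0
for i in range(1, 11):
    s += (-1) ** (i + 1) / i
    S.append(s)

print(abs(S[-1] - math.log(2)))                  # plain partial sum: error about 5e-2
print(abs(epsilon_algorithm(S) - math.log(2)))   # accelerated: many digits gained
```

Note that, as the text warns below, no guard against division by zero is included here; a production version would need Wynn's singular rules.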
C. Brezinski / Journal of Computational and Applied Mathematics 122 (2000) 1-21

When implementing the ε-algorithm or using Wynn's cross rule, a division by zero can occur and the algorithm must be stopped. However, if the singularity is confined (a term that will again be used in Section 1.6), that is, if it occurs only for some adjacent values of the indexes k and n, one may jump over it by using singular rules and continue the computation. If a division by a number close to zero arises, the algorithm becomes numerically unstable due to cancellation errors. A similar situation holds for the other convergence acceleration algorithms. The study of such problems was initiated by Wynn in 1963 [151], who proposed particular rules for the ε-algorithm which are more stable than the usual ones. They were extended by Florent Cordellier in 1979 [51,151]. Particular rules for the θ-algorithm were obtained by Redivo Zaglia [155].
The convergence and acceleration properties of the ε-algorithm have been completely described only for two classes of sequences, namely totally monotonic and totally oscillating sequences. Shanks' transformation and the ε-algorithm have close connections to Padé approximants, continued fractions and formal orthogonal polynomials; see, for example, [18].

2.3. Subsequent developments

The Shanks transformation and the ε-algorithm sparked the rebirth of the study of nonlinear acceleration processes. They now form an independent chapter of numerical analysis with connections to other important topics such as orthogonal and biorthogonal polynomials, continued fractions, and Padé approximants. They also have applications to the solution of systems of linear and nonlinear equations, the computation of the eigenvalues of a matrix, and many other topics; see [40]. Among the other acceleration methods which were obtained and studied are the W-process of Samuel Lubkin [89], the method of Kjell J. Overholt [110], the ρ-algorithm of Wynn [148], the G-transformation of H.L. Gray, T.A. Atchison and G.V. McWilliams [70], the θ-algorithm of Claude Brezinski [14], the transformations of Bernard Germain-Bonne [63] and the various transformations due to David Levin [85]. To my knowledge, the only known acceleration theorem for the ρ-algorithm was obtained by Naoki Osada [108]. Simultaneously, several applications began to appear. For example, the ε-algorithm provides a quadratically convergent method for solving systems of nonlinear equations, and it does not require the knowledge of any derivative. This procedure was proposed simultaneously by Brezinski [13] and Eckhart Gekeler [61]. It has important applications to the solution of boundary value problems for ordinary differential equations [44]. Many other algorithms are given in the work of Ernst Joachim Weniger [145], which also contains applications to physics, or in the book of Brezinski and Michela Redivo Zaglia [40], where applications to various domains of numerical analysis can be found. The authors of this book provide FORTRAN subroutines. The book of Annie Cuyt and Luc Wuytack must also be mentioned [53]. The ε-algorithm has been applied to statistics, see the work of Alain Berlinet [9], and to the acceleration of the convergence of sequences of random variables, considered by Hélène Lavastre [82]. Applications to optimization were proposed by Le Ferrand [84] and Bouchta Rhanizar [118].
Instead of using a quite complicated algorithm, such as the ε-algorithm, it can be interesting to use a simpler one (for instance, Aitken's Δ² process) iteratively. Such a use consists of applying the algorithm to (S_n) to produce a new sequence (T_n), then applying the same algorithm to (T_n), and so on. For example, applying the iterated Δ² process to the successive convergents of a periodic continued fraction produces a better acceleration than using the ε-algorithm [24]. In particular, the iterated Δ² process transforms a logarithmic sequence into a sequence converging linearly, and linear convergence into superlinear; to my knowledge, these are the only known cases of such results.
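As an illustration (my own, not from the paper; the fixed-point example and all names are my choices), here is a sketch of the iterated Δ² process applied to the linearly converging iterates of x_{n+1} = cos(x_n):

```python
import math

def aitken(S):
    """One pass of Aitken's Delta^2 process; returns None on a zero divisor."""
    T = []
    for n in range(len(S) - 2):
        d = S[n + 2] - 2 * S[n + 1] + S[n]
        if d == 0.0:
            return None
        T.append(S[n] - (S[n + 1] - S[n]) ** 2 / d)
    return T

def iterated_aitken(S, passes=3):
    """Apply the same simple transformation over and over, as described above."""
    for _ in range(passes):
        if len(S) < 3:
            break
        T = aitken(S)
        if T is None:     # stop when cancellation kills the denominator
            break
        S = T
    return S[-1]

# linearly converging fixed-point iteration x_{n+1} = cos(x_n)
S = [1.0]
for _ in range(12):
    S.append(math.cos(S[-1]))

x = iterated_aitken(S)
print(abs(math.cos(S[-1]) - S[-1]))  # residual of the plain iteration
print(abs(math.cos(x) - x))          # residual after iterated Aitken: far smaller
```

The number of passes is capped because, near convergence, the second differences in the denominator are dominated by rounding errors, the instability discussed above.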
The experience gained during these years led to a deeper understanding of the subject. Researchers began to study more theoretical and general questions related to the theory of convergence acceleration. The first attempt was made by R. Pennacchi in 1968 [114], who studied rational sequence transformations. His work was generalized by Germain-Bonne in 1973 [62], who proposed a very general framework and showed how to construct new algorithms for accelerating some classes of sequences. However, a groundbreaking discovery was made by Jean Paul Delahaye and Germain-Bonne in 1980 [56]. They proved that if a set of sequences satisfies a certain property, called remanence (too technical to be explained here), then a universal algorithm, i.e., one able to accelerate all sequences of this set, cannot exist. This result shows the limitations of acceleration methods. Many sets of sequences were proved to be remanent, for example, the sets of monotonic or logarithmic sequences. Even some subsets of the set of logarithmic sequences are remanent.
Moulay Driss Benchiboun [5] observed that all the sequence transformations found in the literature could be written as

T_n = f(S_n, ..., S_{n+k}) / Df(S_n, ..., S_{n+k})

with D²f ≡ 0, where Df denotes the sum of the partial derivatives of the function f. The reason for this fact was explained by Brezinski [26], who showed that it is related to the translativity property of sequence transformations. Hassane Sadok [123] extended these results to the vector case. Abderrahim Benazzouz [7] proved that quasilinear transformations can be written as the composition of two projections.
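A quick check of this form (my own illustration, not from [5] or [26]): Aitken's Δ² process fits it with f(S_n, S_{n+1}, S_{n+2}) = S_n S_{n+2} − S²_{n+1}. The sum of the partial derivatives of f is S_n − 2S_{n+1} + S_{n+2}, whose own D-sum is 1 − 2 + 1 = 0, so D²f ≡ 0:

```python
def f(x0, x1, x2):
    # numerator of Aitken's process: f = x0*x2 - x1**2
    return x0 * x2 - x1 ** 2

def Df(x0, x1, x2):
    # sum of the partial derivatives of f: x2 + (-2*x1) + x0
    return x0 - 2 * x1 + x2

def aitken_classical(x0, x1, x2):
    # classical form of Aitken's Delta^2 process
    return x0 - (x1 - x0) ** 2 / (x2 - 2 * x1 + x0)

# the two expressions agree on any triple (here: three cos iterates)
triple = (1.0, 0.5403023058681398, 0.8575532158463934)
assert abs(f(*triple) / Df(*triple) - aitken_classical(*triple)) < 1e-12
```

The agreement is algebraic: expanding x0 − (x1 − x0)²/(x0 − 2x1 + x2) over the common denominator gives exactly (x0·x2 − x1²)/(x0 − 2x1 + x2).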
In many transformations, such as Shanks', the quantities computed are expressed as a ratio of determinants. This property is related to the existence of a triangular recurrence scheme for their computation, as explained by Brezinski and Guido Walz [46].
Herbert Homeier [74] studied a systematic procedure for constructing sequence transformations. He considered iterated transformations which are hierarchically consistent, which means that the kernel of the basic transformation is the lowest one in the hierarchy. The application of the basic transformation to a sequence which is higher in the hierarchy leads to a new sequence belonging to a kernel lower in the hierarchy. Homeier wrote several papers on this topic.
Thus, the theory of convergence acceleration methods has progressed impressively. The practical side was not forgotten, and authors obtained a number of special devices for improving the efficiency of these methods. For example, when a certain sequence is to be accelerated, it is not easy to know in advance which method will give the best result unless some properties of the sequence are already known. Thus, Delahaye [54] proposed using several transformations simultaneously and selecting, at each step of the procedure, one answer among those provided by the various algorithms. He proved that, under some assumptions, some tests are able to find the best answer automatically. The work of Delahaye was extended by Abdelhak Fdil [58,59]. The various answers could also be combined, leading to composite transformations [23]. It is possible, in some cases, to extract a linearly convergent subsequence from the original one and then to accelerate it, for example, by Aitken's Δ² process [37]. Devices for controlling the error were also constructed [21].
When faced with the problem of accelerating the convergence of a given sequence, two approaches are possible. The first is to use a known extrapolation procedure and to try to prove that it accelerates the convergence of the given sequence. The second is to construct an extrapolation procedure especially for that sequence. Convergence tests for sequences and series can be used for that purpose, as explained by Brezinski [25]. This approach was mostly developed by Ana Cristina Matos [92]. Special extrapolation procedures for sequences such that ∀n, S_n − S = a_n D_n, where (D_n) is a known sequence and (a_n) an unknown one, can also be constructed from the asymptotic properties of the sequences (a_n) and (D_n). Brezinski and Redivo Zaglia did this in [39].
A.H. Bentbib [10] considered the acceleration of sequences of intervals. Mohammed Senhadji [129] defined and studied the condition number of a sequence transformation.
2.4. The E-algorithm

As we saw above, the quantities involved in Shanks' transformation are expressed as a ratio of determinants, and the ε-algorithm allows one to compute them recursively. It is well known that an interpolation polynomial can be expressed as a ratio of determinants. Thus polynomial extrapolation also leads to such a ratio, and the Neville-Aitken scheme can be used to avoid the computation of these determinants, which leads to the Richardson extrapolation algorithm. A similar situation arises for many other transformations: in each case, the quantities involved are expressed as a ratio of special determinants and, in each case, one seeks a special recursive algorithm for the practical implementation of the transformation. Thus, there was a real need for a general theory of such sequence transformations and for a single general recursive algorithm for their implementation. This work was performed independently between 1973 and 1980 by five different people. It is now known as the E-algorithm.
It seems that the first appearance of this algorithm is due to Claus Schneider in a paper received on December 21, 1973 [128]. The quantities S(h_i) being given for i = 0, 1, ..., Schneider looked for S̃(h) = S + a_1 g_1(h) + · · · + a_k g_k(h) satisfying the interpolation conditions S̃(h_i) = S(h_i) for i = n, ..., n + k, where the g_j's are given functions of h. Of course, the value of the unknown S thus obtained will depend on the indexes k and n. Assuming that ∀j, g_j(0) = 0, we have S = S̃(0). Denoting by L_k^{(n)} the extrapolation functional on the space of functions f defined at the points h_0 > h_1 > · · · > 0 and at the point 0, and such that L_k^{(n)} f = f(0), we have
L_k^{(n)} S = c_0 S(h_n) + · · · + c_k S(h_{n+k})

with c_0 + · · · + c_k = 1. The interpolation conditions become

L_k^{(n)} E = 1   and   L_k^{(n)} g_j = 0,   j = 1, ..., k,

with E(h) ≡ 1. Schneider wanted to express the functional L_k^{(n)} in the form L_k^{(n)} = a L_{k-1}^{(n)} + b L_{k-1}^{(n+1)}. He obtained the two conditions

L_k^{(n)} E = a + b = 1,

L_k^{(n)} g_k = a L_{k-1}^{(n)} g_k + b L_{k-1}^{(n+1)} g_k = 0.

The values of a and b follow immediately and we have

L_k^{(n)} = ( [L_{k-1}^{(n+1)} g_k] L_{k-1}^{(n)} − [L_{k-1}^{(n)} g_k] L_{k-1}^{(n+1)} ) / ( [L_{k-1}^{(n+1)} g_k] − [L_{k-1}^{(n)} g_k] ).
Thus, the quantities L_k^{(n)} S can be recursively computed by this scheme. The auxiliary quantities L_k^{(n)} g_j needed in this formula must be computed separately by the same scheme using a different initialization. As we shall see below, this algorithm is just the E-algorithm. In a footnote, Schneider mentioned that this representation for L_k^{(n)} was suggested by Börsch-Supan from the Johannes Gutenberg Universität in Mainz.
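Schneider's recursion is easy to sketch in code: the main column and each remaining auxiliary column are updated by the same rule. The snippet below is my own illustration (the function name and the model sequence are assumptions, not from [128]); a sequence of the exact form S_n = S + a_1 g_1(n) + a_2 g_2(n) must be reproduced exactly after two columns:

```python
def e_algorithm(S, aux):
    """Sketch of the E-algorithm.  S is the sequence to transform; aux[j] is
    the sequence of values of the (j+1)-th auxiliary function.  The same
    rule updates the main column and every remaining auxiliary column."""
    E = list(S)
    G = [list(col) for col in aux]
    while G:
        g = G.pop(0)   # plays the role of the quantities L_{k-1}^{(n)} g_k

        def rule(col):
            # ( g^{(n+1)} col^{(n)} - g^{(n)} col^{(n+1)} ) / ( g^{(n+1)} - g^{(n)} )
            return [(g[n + 1] * col[n] - g[n] * col[n + 1]) / (g[n + 1] - g[n])
                    for n in range(len(col) - 1)]

        E = rule(E)
        G = [rule(col) for col in G]
    return E[-1]

# model sequence S_n = 3 + 2*h_n + 5*h_n**2 with h_n = 2**-n:
# eliminating g_1(n) = h_n and then g_2(n) = h_n**2 must give the limit 3
h = [2.0 ** -n for n in range(5)]
S = [3.0 + 2.0 * hn + 5.0 * hn ** 2 for hn in h]
print(e_algorithm(S, [h, [hn ** 2 for hn in h]]))   # 3.0 up to rounding
```

With g_j(n) = h_n^j this is just polynomial (Richardson-type) extrapolation, which is why each column removes one term of the error expansion.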
In 1976, Günter Meinardus and G.D. Taylor wrote a paper [97] on best uniform approximation by functions from span(g_1, ..., g_N) ⊂ C[a, b]. They defined the linear functionals L_k on C[a, b] by

L_k(f) = Σ_i c_i f(h_i),

where a ≤ h_1 < h_2 < · · · < h_{N+1} ≤ b and where the coefficients c_i, which depend on n and k, are such that c_n > 0, c_i = 0 for i ≠ n, ..., n + k, sign c_i = (−1)^{i−n} and

Σ_i |c_i| = 1,

Σ_i c_i g_j(h_i) = 0,   j = 1, ..., k.

By using Gaussian elimination to solve the system of linear equations

Σ_i a_i g_i(h_j) + (−1)^j λ = f(h_j),   j = 1, ..., k,

Meinardus and Taylor obtained the recursive scheme

L_k^{i}(f) = ( L_{k-1}^{i+1}(g_k) L_{k-1}^{i}(f) − L_{k-1}^{i}(g_k) L_{k-1}^{i+1}(f) ) / ( L_{k-1}^{i+1}(g_k) − L_{k-1}^{i}(g_k) )

with L_0^{i}(f) = f(h_i), i = n, ..., n + k. This is the same scheme as above.
Newton's formula for computing the interpolation polynomial is well known. It is based on divided differences. One can try to generalize these formulae to the case of interpolation by a linear combination of functions from a complete Chebyshev system (a technical concept which ensures the existence and uniqueness of the solution). We seek

P_k(x) = a_0 g_0(x) + · · · + a_k g_k(x)

satisfying the interpolation conditions

P_k(x_i) = f(x_i),   i = n, ..., n + k,

where the x_i's are distinct points and the g_i's given functions. The P_k's can be recursively computed by an algorithm which generalizes the Neville-Aitken scheme for polynomial interpolation. This algorithm was obtained by Günter Mühlbach in 1976 [103] from a generalization of the notion of divided differences and their recurrence relationship. It was called the Mühlbach-Neville-Aitken algorithm, for short the MNA. It is as follows:

P_k^{(n)}(x) = ( g_{k-1,k}^{(n+1)}(x) P_{k-1}^{(n)}(x) − g_{k-1,k}^{(n)}(x) P_{k-1}^{(n+1)}(x) ) / ( g_{k-1,k}^{(n+1)}(x) − g_{k-1,k}^{(n)}(x) )
with P_0^{(n)}(x) = f(x_n) g_0(x)/g_0(x_n). The g_{k,i}^{(n)}'s can be recursively computed by a quite similar relationship:

g_{k,i}^{(n)}(x) = ( g_{k-1,k}^{(n+1)}(x) g_{k-1,i}^{(n)}(x) − g_{k-1,k}^{(n)}(x) g_{k-1,i}^{(n+1)}(x) ) / ( g_{k-1,k}^{(n+1)}(x) − g_{k-1,k}^{(n)}(x) )

with g_{0,i}^{(n)}(x) = g_i(x_n) g_0(x)/g_0(x_n) − g_i(x). If g_0(x) ≡ 1 and if it is assumed that ∀i > 0, g_i(0) = 0, the quantities P_k^{(n)}(0) are the same as those obtained by the E-algorithm, and the MNA reduces to it.
Let us mention that, in fact, the MNA is closely related to the work of Henri Marie Andoyer (1862-1929) which goes back to 1906 [2]; see [30] for detailed explanations.
We now come to the work of Tore Håvie. We already mentioned Romberg's method for accelerating the convergence of the trapezoidal rule. The success of this procedure is based on the existence of the Euler-Maclaurin expansion for the error. This expansion only holds if the function to be integrated has no singularity in the interval. In the presence of singularities, the expansion of the error is no longer a series in h² (h being the stepsize) but a more complicated one depending on the singularity. Thus, Romberg's scheme has to be modified to incorporate the various terms appearing in the expansion of the error. Several authors worked on this question, treating several types of

