and simulation. This highlights the problem of constructing efficient statis
tical tests aimed to detecting possible deviations from randomness. Thus
the US National Institute of Standards and Technology (NIST) has car
ried out an investigation of known statistical tests for random and pseudo
random numbers. The results and recommendations for practical usage
were published in [Rukhin et al. (2001)l. Particularly, 16 methods were
recommended for use in cryptography. It is important to note that these
tests were selected as a result of a comprehensive theoretical and experi
mental analysis and may be considered as the stateoftheart in random
ness testing. Recently, in a series of works [Ryabko and Pestunov (2004);
Ryabko et al. (2004); Ryabko and Monarev (2004)l a number of new tests
were suggested whose performance occurred to be significantly greater than
that of the NIST tests. In this section, we describe one of these new meth
ods.
In a statistical test for randomness, we consider the main hypothesis
HOthat a bit sequence is generated by the memoryless source with equal
probabilities of zeroes and ones. Associated with this null hypothesis is the
alternative hypothesis H I that the sequence is generated by a stationary
ergodic source which differs from the source under Ho, H I = 1HO.
Let™s describe the test suggested in [Ryabko and Pestunov (2004)].
In this test, the input sequence x 1 , x 2 , . . . , x , is divided into subwords
x 1 , x 2 , . . . , x , , x s + l , x , + 2 , . . . , x z S , . . . , s 2 1, and the hypothesis H,* that
the subwords obey the uniform distribution (i.e. each subword is generated
with the probability Y5)is tested against H,* = H,*. It is clear that H,*
is true if Ho is true. The key idea of the test is as follows. All subwords
from the set { O , l } s are ordered and this order changes after processing
each subword x j s + l , x j s + 2 , .. . , ˜ j , + ˜j, = 0,1,. . . , in such a way that,
loosely speaking, the more frequent subwords have small ordinals. Then
the frequencies of different ordinals are estimated (instead of frequencies
of the subwords as, say, for chisquare test [Kendall and Stuart (1961);
Basics of Contemporary Cryptography for I T Practitioners
176
Knuth (1981)l). The test is based on construction of adaptive code sug
gested in [Ryabko (1980)] and called book stuck.
We shall consider the input sequence as generated by a source over the
alphabet A = { u l ,. , , , u s } (in practice, S = 2" but it is not necessary).
Suppose we are given a sample u1, u2, . . . ,u generated by the source.
,
In the book stack test, all letters from A are ordered from 1 to S and
r
this order is changed after observing each letter ut according to the formula
= a,
if ut
+
vt(u) I , if vt(u)< vt(ut), (9.2)
vt+l(u) =
if vt(u)> vt(ut)
vt(u),
where vt is the order after observing u1,u2,. . . ,ut, t = 1,.. . ,n, v 1 being
defined arbitrarily. (For example, we can define v1 = (al, . . . ,as).)
Let us explain informally (9.2). Suppose that the letters from A are
arranged in a stack, like a stack of books, and d ( u ) is a position of a in the
stack. Let the first letter u1 of the word ul, up,. . . ,u, be a. If it occupies
the 21th position in the stack (vl(u) il), then extract a out of the stack
=
and push it to the top. (It means that the order is changed according
to (9.2).) Repeat the procedure with the second letter u2 and the stack
obtained, etc.
It can help to understand the main idea of the suggested method if
we take into account that, if H1 is true, the frequent letters from A (as
frequently used books) will have relatively small ordinals (will spend more
time near the top of the stack). On the other hand, if HO is true, the
probability to find each letter ui at each position j is equal to 1/S.
Let's continue the description of the test. The set of all indexes
(1,. . . , S } is divided into T , T 2 2, subsets A1 = {1,2,. . . ,k1}, A2 =
{ k l + l , . . . , kp}, . . . , A, = {krl+l, . . . , k.. Then, using u1,up, . . . ,un, we
,}
calculate how many v t ( z t ) , = 1,. . . ,n, belong to a subset A k , k = 1,. . . ,T .
t
We denote this number by nk. More formally,
Obviously, if H,* is true, the probability of the event vt((.t) E Ak is equal
to [ A j [ / S Then, using a usual chisquare test, we test the hypothesis
.
B = P { d ( X t )E Ak} = ( A j [ / S
o
being based on the empirical frequencies 721,. . . ,n,, against H I = ˜ H o .
177
Random Numbers i Cryptography
n
Let us recall that the value
(9.3)
is computed, when the chisquare test is applied, see, e.g. [Kendall and
Stuart (1961); Knuth (198l)l. It is known that z2 asymptotically follows
(XE˜)
the chisquare distribution with (k  1) degrees of freedom if f i 0 is
true. If the level of significance (or a Type I error) of the chisquare test
is a,0 < a < 1, the hypothesis HO is accepted when x2 from (9.3) is less
than the (1  a)value of the & distribution.
We do not describe the exact rule for constructing the subsets Al, . . . ,
A,., but recommend to implement some experiments for finding the param
eters which make the sample size minimal (or, at least, acceptable). The
point is that there are many cryptographic applications where it is possible
to implement some experiments for optimising the parameter values and,
then, to test the hypothesis based on independent data. For example, in
case of testing a pseudorandom number generator it is possible to seek
suitable parameters using a part of generated sequence and then to test the
generator using a new part of the sequence.
Let us consider an example.
Example 9 3 Let
.
A = {al,.. ,a6}, 211.. . = a3a6a3a3a6a1a6a1,
T = 2, A1 = {1,2,3}, A2 = {4,5,6},
v1 = (a12a27 a37 a4,'35, a6) *
Then
nl=O,nz=O;
nl= 1 ;
122 = 1 ;
n1 = 2 ;
n1=3;
n1 = 4 ;
n1=5;
n =6;
l
nl=7.
We can see that the letters a3 and a6 axe quite frequent and the book stack
test indicates this nonuniformity quite well. Indeed, the average values of
Basics of Contemporary Cryptoyaphy for IT Practitioners
178
and are equal to 4, whereas the real values are 7 and 1, respectively.
n1 712
Let us make two notes here. First, we pay attention to the complexity
of this algorithm. The “naive” method of transformation according to
(9.2) would take the number of operations proportional to S , but there
exist algorithms which can perform all operations in (9.2) in O(1og S ) time.
Such algorithms can be based on AVL trees, see, e.g. [Aho et al. (1976)I.
The second comment concerns with the name of the method. The book
stack structure is quite popular in information theory and computer science.
In information theory, this structure was firstly suggested as a basis of a
universal code in [Ryabko (1980)l and then rediscovered in [Bently et al.
(1986)]. In the literature this code is frequently called “movetofront”
(MTF) scheme as it w s suggested in [Bently et al. (1986)I.
a
As we have already noted, in [Ryabko and Pestunov (2004); Ryabko and
Monarev (2004)], the book stack test was compared to the NIST tests. The
power of book stack test was shown to be significantly greater than that of
all 16 tests recommended by NIST; see the cited papers for details.
Statistical Attack to Block Ciphers
9.5
Cryptanalysis of block ciphers attracts much research, and new results in
the field are always beneficial for improving constructions of the ciphers.
Sometimes the complexity of a new attack (measured in the number of
memory units and operations required for mounting such an attack) might
be quite large. Nevertheless, even if a relatively small decrease in the attack
complexity is achieved, in comparison with previously known methods, this
can motivate further development of the cipher design. Thus linear crypt
analysis of DES (see [Menezes et al. (1996)l) requires 243 known plaintext
ciphertext pairs and is generally considered infeasible in practice. But it
has made an important impact on the design principles of modern block
ciphers, that are now resistant to this kind of attack. In this section, we
described a new attack to block ciphers, referred to as a “gradient statis
tical attack”. This attack connects such different problems as randomness
testing and block cipher cryptanalysis.
Consider a block cipher with the blocklength n, key length s, and en
cryption function E ( z ,K ) , where z ‚¬ (0,l)” denotes a plaintext block,
and K E (0,l)” a secret key (we need to change the notation slightly as
compared to Sec. 8.2). The typical values of n and s for modern block
ciphers are n = 64 or n = 128, s = 128 bits (see Sec. 8.2). The majority
R a n d o m Numbers in Cryptography 179
of block ciphers are iterated, i.e. involve many rounds of transformations
usually bracketed by some prologue and epilogue. Either of these, in turn,
can sometimes be divided into a number of more simple steps. In respect
of this iterated structure, the secret key K is expanded into a sequence
.
of subkeys or round keys k l , kz, . . ,kt, where t is the number of “simple
steps” in a block cipher. Denote by 20 the initial state of block z, and by xi
the state after the ith step. So the complete encryption is zt = E(zo,K )
and this can be written a s
where Ei denotes the encrypting transformation at the ith step.
Example 9.4 Consider the cipher RC5 (see p. 145) with the blocklength
64 and number of rounds r. The encrypting process, with reference to (9.4),
is as follows:
input: ( a , b )
prologue:
a+a+kl
b+b+k2
round 1:
+
a + ( ( a@ b) w b) k3
+ k4
b + ( ( b CBa ) + a )
round r:
+ br+1
a + ( ( a@ b) t˜ b)
+ hr+2
b + ((b@ a ) +a ) = Et(Xt1, k t )
Zt
( a , b)
output: xt = ( a , b )
Basics of Contemporary Cryptography for IT Practitioners
180
must be secure against them. We consider a chosenplaintext attack for
the cipher which can be represented by (9.4) with relatively small subkeys.
Denote the lengths of the secret key and each subkey by lKl and Ikl, re
spectively. The exhaustive keysearch requires 0(2IKI) operations (decrypt
with K = 0,1,. . . until a known 2 is obtained). Meanwhile, the described
attack requires O(mt21™1) operations, where m is the number of ciphertext
blocks sufficient for statistical analysis. The attack succeeds in finding the
correct subkeys (instead of K itself), provided the statistical test is able to
detect deviations from randomness in a sequence of m blocks. It is essential
to use new efficient statistical tests like those presented in Sec. 9.4. As an
example, we shall consider experimental implementation of the attack with
respect to RC5 block cipher.
As we consider the block ciphers that can be described by (9.4), notice
that the sequence corresponding to (9.4) for decryption is
 , h)
= Dt(% k t ) 7 (9.5)
20 = Dl(Z1,
Zt1 ** 7
where Di denotes the decrypting transformation inverse to Ei.
One of the requirements to a block cipher is that, given a sequence of
different blocks as input, the cipher must output a sequence of bits that
looks like completely random (see CTR block cipher mode on p. 163). We
shall loosely call bit sequences “more random” or “less random” depending
on how much they differ from a completely random sequence. One way
to measure randomness is to use some statistic on the sequence with the
property that less random sequences have greater statistics (to within some
probability of error in decision). This may be a wellknown z2 statistic
subjected to x2 distribution. Denote such a statistic by y(z), where 2 is a
bit sequence.
Denote by a1, a2, . . . , am a sequence of input blocks. Let all the blocks
be nonrandom and painvise different. A possible example would be a = 1, 1
a = 2, . . . , a, = m,where the numbers are written as nbit words. For
2
a good block cipher, the encrypted sequence

E(a1,K ) , E(a2,K ) E(am,K )
* 9