dent bits. Finally, we have shown how pseudorandom sequences are used in stream ciphers to encrypt messages. It is natural then that the problem of generating random and pseudorandom numbers is of great interest to the designers of cryptosystems. Moreover, it turns out that many fundamental problems of cryptography are closely connected with generating and testing random numbers. For instance, the theoretical possibility of constructing secure pseudorandom generators depends on the existence of one-way functions, and one of the attacks on block ciphers is based on statistical tests designed for detecting deviations from randomness.
In this chapter, we shall consider some problems, ideas, and methods in
cryptography connected with generation and testing of random numbers.
We begin with the main question: what is a random number or, more generally, a sequence of random numbers? One possible definition is the following: it is a (binary) sequence obtained by fair coin tossing if the sides of the coin are marked by 0 and 1. More formally, a random sequence may be defined as a sequence consisting of independent zeroes and ones whose probabilities are equal. Sometimes such sequences are called completely random, but we shall usually omit the adverb for brevity. Other random numbers, say, random integers from some interval, can easily be produced from binary random sequences, and we shall not discuss this issue.
Basics of Contemporary Cryptography for IT Practitioners
The next question is how to generate random numbers in a computing environment. It is clear that such "manual" methods as coin tossing and drawing cards can hardly be applied. What is really useful is digitising physical processes that are known to contain randomness. These may
be noises in electrical circuits or electronic components (such as diodes),
physical particle counters, or computer mouse motions. One of the main
problems here is the conversion of physical measurements into completely
random sequences of bits. This problem will be addressed in the next
section.
Very often we may (and sometimes shall) use pseudorandom numbers instead of random ones. We have already considered the problem of generating cryptographically secure pseudorandom sequences in Sec. 8.4. We shall return to this problem in Sec. 9.3 to discuss some additional issues.
The demands on the quality of random and pseudorandom numbers in cryptography are extremely high. First of all, any statistical deviations from the standard must be ruled out, i.e. zeroes and ones must be equiprobable and independent. For detecting such deviations, special statistical tests are used. The US National Institute of Standards and Technology (NIST) recommends 16 tests for use in cryptography [Rukhin et al. (2001)]. In Sec. 9.4 we shall consider some recently suggested methods that outperform the NIST tests.
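To give a feel for such tests, the simplest of the NIST battery, the frequency (monobit) test, checks whether the numbers of zeroes and ones in a sequence are close enough to each other. The following Python sketch (the function name is ours) computes the p-value via the normal approximation used in [Rukhin et al. (2001)]; a p-value below a threshold such as 0.01 indicates a deviation from randomness:

```python
from math import erfc, sqrt

def monobit_test(bits):
    """Frequency (monobit) test: p-value via the normal approximation."""
    n = len(bits)
    s = sum(2 * b - 1 for b in bits)       # +1 for each one, -1 for each zero
    return erfc(abs(s) / sqrt(n) / sqrt(2))

# A balanced sequence passes; a heavily biased one fails (p < 0.01).
print(monobit_test([0, 1] * 500) > 0.01)   # -> True
print(monobit_test([1] * 1000) > 0.01)     # -> False
```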
As we have noted at the beginning of the chapter, various branches of cryptography are usually connected, but this is not always obvious. An example of such a connection will be given in the last section, where we describe an attack on block ciphers that is closely connected with random numbers and statistical tests.
9.2 Refining Physical Random Number Generators
Assume that we have a physical generator that produces a sequence of random zeroes and ones, but the bits of the sequence are not equiprobable and/or exhibit some correlations. Our task is to convert this sequence into a completely random one. This conversion is often referred to as the refinement of the generator. There are two classes of refinement algorithms. The algorithms of the first class are built on the assumption that the source bits are independent but, perhaps, not equiprobable. The second-class algorithms assume that the bits may also be correlated (more exactly, obey some correlation model). We shall consider only the first-class algorithms. The methods
of the second class go beyond the scope of this book for two reasons: first, for many physical generators (e.g. those based on noise diodes), the measurements made at distant time instants may be considered independent; second, the methods eliminating dependences can, as a rule, do so only approximately (i.e. the resulting bits are only "almost" independent), their description is complicated, and today they are rarely used in practice. But it is worth noting that there exist interesting and elegant methods in this field described in the literature on cryptography and computational complexity (see, e.g. [Nisan and Zuckerman (1996)] and the survey [Nisan and Ta-Shma (1999)]).
So, in the rest of the section, we consider methods refining sequences of independent but not equiprobable zeroes and ones.
The first algorithm for solving this problem was due to John von Neumann (see [Elias (1987)]). To describe the algorithm, we introduce the necessary notation. Let a memoryless source be given generating letters over the alphabet {0, 1} with probabilities 1 − p and p, respectively. The probability p may be unknown. It is required to convert the sequence generated by the source into a completely random sequence.
Von Neumann suggested the following method. The source sequence is split into 2-bit blocks that are encoded by the rule

00 → Λ,  01 → 0,  10 → 1,  11 → Λ,    (9.1)

where Λ denotes the empty word.
Example 9.1 Let the source sequence be m = 1101110110. We split it into 2-bit blocks (delimited by commas): m = 11, 01, 11, 01, 10. Now we apply the mapping (9.1) and obtain a completely random sequence z = Λ, 0, Λ, 0, 1 = 001. So, we have extracted 3 completely random bits from the 10-bit source sequence.
To justify the method, notice that the probabilities of the blocks 01 and 10 are equal, since they are p(1 − p) and (1 − p)p, respectively. Therefore, the resulting sequence consists of equiprobable zeroes and ones, as required.
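Rule (9.1) is straightforward to implement. The following Python sketch (the function name is ours) reproduces Example 9.1:

```python
def von_neumann_extract(bits):
    """Map non-overlapping 2-bit blocks by rule (9.1):
    01 -> 0, 10 -> 1, 00 and 11 -> nothing."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)  # 01 yields 0, 10 yields 1 (the first bit of the pair)
    return out

# Example 9.1: source sequence 1101110110
src = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]
print(von_neumann_extract(src))  # -> [0, 0, 1]
```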
The given example demonstrates the disadvantage of the method defined by (9.1): the resulting sequence is much shorter than the source one. More exactly, it is easy to see that given t source symbols we obtain tp(1 − p) output bits on average. For example, if p is close to 1/2 then we obtain, on average, about t/4 output bits.
Peter Elias suggested a conversion method which consumes fewer source bits [Elias (1987)]. This is achieved by encoding blocks of length n, n > 2 (if n = 2, the methods of Elias and von Neumann coincide). To measure the effectiveness of the algorithm, Elias introduced the quantity η_n defined as the ratio of the mean length of the resulting codeword to the block length n. He showed that the natural upper bound for η_n is the source entropy h(p). Recall from Sec. 7.4, Eq. (7.7) that in the binary case, h(p) = −(p log p + (1 − p) log(1 − p)). (It is easy to understand why h(p) is the maximum possible value for the ratio between the lengths of output and input sequences. Informally, on the one hand, the entropy is a measure of uncertainty, or randomness, of the input sequence. On the other hand, the entropy of the output completely random sequence is equal to its length, since the entropy of one completely random bit is 1.)
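The entropy bound is easy to compute and compare with von Neumann's output rate of p(1 − p) bits per source bit; a small illustrative Python sketch (function name is ours):

```python
from math import log2

def h(p):
    """Binary entropy h(p) = -(p*log p + (1-p)*log(1-p)), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

# Von Neumann's method yields p(1-p) bits per source bit on average,
# while h(p) is the ceiling that any refinement method can approach.
for p in (0.5, 0.3, 0.1):
    print(f"p={p}: von Neumann rate {p * (1 - p):.3f}, entropy bound {h(p):.3f}")
```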
In fact, the idea of Elias's approach can be illustrated by an algorithm we have already discussed. Revert to the description of the ideal cipher construction in Sec. 7.6. There a message generated by a binary memoryless source with unknown statistics (i.e. our input to a refinement algorithm, in terms of the present section) was transformed to the sequence

u(ū_1)v(ū_1)w(ū_1) u(ū_2)v(ū_2)w(ū_2) . . . ,

where ū_i, i = 1, 2, . . . , denote the input blocks of n bits and u(·), v(·), w(·) some transformations, w(ū_i) being completely random. Now the output of Elias's algorithm can be obtained by simply discarding the words u(ū_i) and v(ū_i), i.e. it will consist of the concatenated words w(ū_i). We may summarise that using this approach we obtain the following refinement conversion of the source sequence:

ū_1 ū_2 . . . → w(ū_1) w(ū_2) . . . .
Example 9.2 Let the same message as in Example 9.1 be generated, ū = 1101110110. Split it into 5-bit blocks: ū = 11011, 10110, where ū_1 = 11011, ū_2 = 10110. Apply to these blocks the coding method described in Sec. 7.6 (with the obvious substitution of a_1 for 0 and a_2 for 1). We obtain w(ū_1) = (10)_2 and w(ū_2) = (110)_2, see p. 133. Hence, the source message ū converts to the completely random sequence w(ū_1)w(ū_2):

1101110110 → 10110.

So, we have extracted 5 completely random bits.
By Proposition 7.5 the average length of each word w(ū_i) is greater than nh(p) − 2 log(n + 1). Hence we obtain the lower bound for η_n:

η_n > (nh(p) − 2 log(n + 1)) / n = h(p) − 2 log(n + 1) / n.

It is clearly seen that as n increases, η_n approaches h(p).
So, the Elias algorithm allows the effectiveness η_n to approach the upper bound (Shannon entropy) as the block size n increases. But, at the same time, it requires the memory size to grow as 2^n, which makes the algorithm intractable for large n. In [Ryabko and Matchikina (2000)], a significantly more efficient method was suggested, with memory size growing as n log² n. We refer the interested reader to the cited paper for more details.
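For the interested reader, here is an illustrative Python sketch of Elias's block-wise refinement (function names are ours). An n-bit block of weight k is ranked among the C(n, k) equally likely blocks of the same weight; then m = C(n, k) is decomposed into powers of two, and the power-of-two chunk containing the rank yields that many perfectly uniform bits. For n = 2 the scheme reduces to von Neumann's rule (9.1):

```python
from math import comb

def rank_among_equal_weight(bits):
    """Lexicographic rank of `bits` among n-bit strings of the same weight."""
    rank, ones_left = 0, sum(bits)
    for i, b in enumerate(bits):
        if b == 1:
            # count the strings that have 0 in this position instead
            rank += comb(len(bits) - i - 1, ones_left)
            ones_left -= 1
    return rank

def elias_extract_block(bits):
    """Elias refinement of one n-bit block of i.i.d. (possibly biased) bits."""
    n, k = len(bits), sum(bits)
    m = comb(n, k)      # number of equally likely blocks of weight k
    rank = rank_among_equal_weight(bits)
    offset = 0
    # Decompose m into powers of two; the rank falls into exactly one
    # 2^j-sized chunk, from which j uniform bits are emitted.
    for j in range(m.bit_length() - 1, -1, -1):
        if (m >> j) & 1:
            if rank < offset + (1 << j):
                r = rank - offset
                return [(r >> t) & 1 for t in range(j - 1, -1, -1)]
            offset += 1 << j

# For n = 2 this reproduces von Neumann's rule (9.1):
print(elias_extract_block([0, 1]))  # -> [0]
print(elias_extract_block([1, 0]))  # -> [1]
print(elias_extract_block([1, 1]))  # -> []
```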
9.3 Pseudo-Random Number Generators
In Sec. 8.4, we have already considered pseudorandom number generators used for constructing stream ciphers and discussed the requirements for cryptographically secure generators. In this section, we shall briefly touch on some additional issues.
First, let us comment on the last two requirements given on p. 161. They claim that (a) determining the next element of the sequence given the preceding elements should be an infeasible problem, and (b) the sequence produced by the generator should be indistinguishable from a completely random sequence by any statistical test. It is clear that computational complexity here is crucial. For example, if we are not computationally bounded, then any pseudorandom generator can be predicted. Suppose we have a generated sequence z_1, z_2, . . . , z_i and wish to predict (to guess) z_{i+1}. We know that the whole sequence depends on some initial (secret) state of the generator, which can be represented as an integer s. We can try one value of s after another until such an s is found that makes the generator produce exactly z_1, z_2, . . . , z_i. Now, knowing s, we can easily compute z_{i+1}. If the length of s is |s| bits, this approach requires about 2^|s| computations and is infeasible if |s| is about 100 bits or greater.
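The exhaustive search just described can be demonstrated on a toy generator with a deliberately tiny state (all names and constants below are illustrative, not a real cipher):

```python
def toy_generator(seed, length, modulus=2**16):
    """A deliberately weak bit generator with only a 16-bit secret state s."""
    out, s = [], seed
    for _ in range(length):
        s = (1103515245 * s + 12345) % modulus  # linear congruential state update
        out.append(s & 1)                       # emit the least significant bit
    return out

def recover_seed(observed, modulus=2**16):
    """Try every initial state -- about 2^16 trials, feasible only because |s| is tiny."""
    for s in range(modulus):
        if toy_generator(s, len(observed)) == observed:
            return s
    return None

secret = 31337
stream = toy_generator(secret, 40)
found = recover_seed(stream)
# Any seed reproducing the observed prefix lets us predict the next bit.
next_bit = toy_generator(found, 41)[-1]
```

With |s| = 16 the search finishes in a fraction of a second; at |s| ≈ 100 bits the same loop would need about 2^100 trials.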
The problem with practical pseudorandom generators, such as RC4, is that computational bounds for their security are not determined with mathematical rigour. In the case of block-cipher-based generators, the computational bound of security is proved to be exponential if the underlying block cipher
is “ideal”. But, in turn, for practical block ciphers, conformance to the “ideality” conditions is not proved. So, many investigators are engaged in searching for new algorithms of statistical testing and prediction in the hope of finding some breaches in the generators' constructions, and some recent results in the field will be presented in the following sections.
However, there exists an elegant theory that allows one to connect the above problem with other branches of cryptography and with complexity theory. A detailed exposition of this theory can be found, e.g. in [Goldwasser and Bellare (2001)], but we shall only point out some basic results. First, it is shown that there exist pseudorandom number generators that produce sequences indistinguishable from completely random ones by any polynomial-time test. This means, in particular, that such sequences can securely substitute for completely random sequences in polynomial-time algorithms. Second, it is shown that both requirements to generated sequences, marked (a) and (b) above, are equivalent if we deal with polynomial-time algorithms. Third, a connection is established between pseudorandom number generators and one-way functions which, in particular, allows one to construct “provably secure” generators based on one-way functions. The quotation marks here mean that the proofs are valid under certain conditions typical for complexity theory, e.g. they are based on unproved assumptions about the one-wayness of certain functions. We shall describe one such provably secure generator based on RSA.
The parameters of the RSA-based generator are two large primes P and Q (P ≠ Q), their product N = PQ, and a number e ≥ 3 coprime to (P − 1)(Q − 1). The initial state x_0 is a randomly chosen number from the interval (1, N − 1). The generator produces a sequence of bits z_1, z_2, . . . , z_k according to the following scheme:

x_i ← x_{i−1}^e mod N,
z_i ← least significant bit of x_i,    i = 1, 2, . . . , k.
Note that the number e can be 3 which simplifies exponentiation.
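A toy Python sketch of this scheme (the primes used here are far too small for any real security and serve only to illustrate the arithmetic; the function name is ours):

```python
def rsa_generator(p, q, e, x0, k):
    """Sketch of the RSA generator: x_i = x_{i-1}^e mod N, z_i = lsb(x_i)."""
    n = p * q
    x, bits = x0, []
    for _ in range(k):
        x = pow(x, e, n)    # modular exponentiation; cheap when e = 3
        bits.append(x & 1)  # z_i = least significant bit of x_i
    return bits

# Toy parameters only: real security needs P, Q of hundreds of digits.
# Here e = 3 is coprime to (59 - 1)*(47 - 1) = 2668, as required.
print(rsa_generator(59, 47, 3, 7, 8))
```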
The described generator is proved to be cryptographically secure under the assumption that RSA cannot be broken. This generator is significantly slower than the generators described in Sec. 8.4 and is therefore unsuitable for stream ciphers. However, it may be used successfully in other tasks where high security is paramount. For example, it may be used to generate parameters for cryptographic protocols based on a relatively small random number x_0. Other generators that make use of one-way functions may be
found in the literature, see, e.g. [Goldwasser and Bellare (2001); Menezes
et al. (1996)].
9.4 Statistical Tests for Random and Pseudo-Random Number Generators
Random and pseudorandom numbers are widely used not only in cryptog