

on the claim that the key sequence consists of equiprobable and indepen-
dent bits. Finally, we have shown how pseudo-random sequences are used
in stream ciphers to encrypt messages. It is natural then that the problem
of generating random and pseudo-random numbers is of great interest for
the designers of cryptosystems. Moreover, it turns out that many fundamental
problems of cryptography are closely connected with generating and test-
ing random numbers. For instance, the theoretical possibility of constructing
secure pseudo-random generators depends on the existence of one-way func-
tions, and one of the attacks on block ciphers is based on statistical tests
designed for detecting deviations from randomness.
In this chapter, we shall consider some problems, ideas, and methods in
cryptography connected with generation and testing of random numbers.
We begin with the main question: what is a random number or, more
generally, a sequence of random numbers? One possible definition
is the following: it is a (binary) sequence obtained by fair coin tossing,
where the sides of the coin are marked 0 and 1. More formally, a random
sequence may be defined as a sequence consisting of independent zeroes and
ones whose probabilities are equal. Sometimes such sequences are called
completely random but we shall usually omit the adverb for brevity. Other
random numbers, say, random integers from some interval, can easily be
produced from binary random sequences and we shall not discuss this issue.
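As a brief aside, the standard technique for this is easy to sketch. The code below is our illustration (the function name random_int and the use of Python's secrets module as a stand-in for a source of fair bits are not from the text): draw just enough bits to cover the interval and reject out-of-range values.

```python
import secrets

def random_int(a, b):
    """Uniform integer from [a, b], built from unbiased random bits.

    Rejection sampling: draw enough fair bits to cover 0..span-1,
    interpret them as an integer r, and retry whenever r falls
    outside the range (so no value is favoured).
    """
    span = b - a + 1
    nbits = (span - 1).bit_length()  # enough bits to cover 0..span-1
    while True:
        r = secrets.randbits(nbits)  # nbits independent fair bits
        if r < span:
            return a + r

print(random_int(1, 6))  # a fair die roll
```

Rejection is essential: simply reducing the bits modulo the span would bias small values whenever the span is not a power of two.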

Basics of Contemporary Cryptography for IT Practitioners

The next question is how to generate random numbers in a computing
environment. It is clear that such “manual” methods as coin tossing and
drawing cards can hardly be applied. What is really useful is digitising
the physical processes that are known to contain randomness. These may
be noises in electrical circuits or electronic components (such as diodes),
physical particle counters, or computer mouse motions. One of the main
problems here is the conversion of physical measurements into completely
random sequences of bits. This problem will be addressed in the next section.
Very often we may (and sometimes shall) use pseudo-random numbers
instead of random ones. We have already considered the problem of generating
cryptographically secure pseudo-random sequences in Sec. 8.4. We shall
return to this problem in Sec. 9.3 to discuss some extra issues.
The demands on the quality of random and pseudo-random numbers
in cryptography are extremely high. First of all, any statistical deviations
from the standard must be ruled out, i.e. zeroes and ones must be equiprobable
and independent. To detect such deviations, special statistical tests
are used. The US National Institute of Standards and Technology (NIST)
recommends 16 tests for use in cryptography [Rukhin et al. (2001)]. In
Sec. 9.4 we shall consider some new methods suggested recently that out-
perform the NIST tests.
As we have noted at the beginning of the chapter, various branches of
cryptography are usually connected but this is not always obvious. An ex-
ample of such connection will be given in the last section where we describe
an attack on block ciphers that is closely connected with random numbers
and statistical tests.

9.2 Refining Physical Random Number Generators

Assume that we have a physical generator that produces a sequence of
random zeroes and ones but the bits of the sequence are not equiprobable
and/or exhibit some correlations. Our task is to convert this sequence into
a completely random one. This conversion is often referred to as the refinement
of the generator. There are two classes of refinement algorithms. The algo-
rithms of the first class are built on the assumption that the source bits are
independent but, perhaps, not equiprobable. The second class algorithms
assume that bits may also be correlated (more exactly, obey some correla-
tion model). We shall consider only the first class algorithms. The methods

of the second class go beyond the scope of the book for two reasons: first,
for many physical generators (e.g. based on noise diodes), the measure-
ments made at distant time instances may be considered as independent;
second, the methods eliminating dependences can, as a rule, do that only
approximately (i.e. the resulting bits are only “almost” independent); besides,
their description is complicated and today they are rarely used in practice. But it
is worth noting that there exist interesting and elegant methods in this field
described in the literature on cryptography and computational complexity
(see, e.g. [Nisan and Zuckerman (1996)] and the survey [Nisan and Ta-Shma (1999)]).
So, in the rest of the section, we consider the methods refining the
sequences of independent but not equiprobable zeroes and ones.
The first algorithm for solving this problem was due to John von Neu-
mann (see [Elias (1987)]). To describe the algorithm, we introduce the
necessary notation. Let a memoryless source be given that generates letters
over the alphabet {0, 1} with probabilities 1 − p and p, respectively. The
probability p may be unknown. It is required to convert the sequence gen-
erated by the source into a completely random sequence.
Von Neumann suggested the following method. The source sequence is
split into 2-bit blocks that are encoded by the rule

00 → Λ, 11 → Λ, 01 → 0, 10 → 1,    (9.1)

where Λ denotes the empty word.

Example 9.1 Let the source sequence be m = 1101110110. We split it
into 2-bit blocks (delimited by commas): m = 11, 01, 11, 01, 10. Now we
apply the mapping (9.1) and obtain a completely random sequence z =
Λ, 0, Λ, 0, 1 = 001. So, we have extracted 3 completely random bits from the
10-bit source sequence.

For justification of the method, notice that the probabilities of the blocks
01 and 10 are equal since they are p(1 − p) and (1 − p)p, respectively.
Therefore, the resulting sequence consists of equiprobable zeroes and ones,
as required.
The given example demonstrates the disadvantage of the method defined
by (9.1): the resulting sequence is much shorter than the source one. More
exactly, it is easy to see that given t source symbols we obtain tp(1 − p)
output bits on average. For example, if p is close to 1/2 then we obtain, on
average, about t/4 output bits.
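Both the rule (9.1) and the efficiency estimate tp(1 − p) are easy to check empirically. The following sketch is ours, with Python's random module merely simulating a biased memoryless source:

```python
import random

def von_neumann(bits):
    """Von Neumann refinement, rule (9.1): split the input into pairs,
    output 0 for the pair 01, 1 for the pair 10, and discard the
    equal pairs 00 and 11 (the empty word)."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)  # (0,1) -> 0, (1,0) -> 1
    return out

# Example 9.1: m = 1101110110 yields 001
print(von_neumann([1, 1, 0, 1, 1, 1, 0, 1, 1, 0]))  # -> [0, 0, 1]
```

Running this on, say, 100 000 bits from a source with p = 0.8 produces close to 100 000 · 0.8 · 0.2 = 16 000 output bits, with zeroes and ones appearing equally often.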

Peter Elias suggested a conversion method which consumes fewer source
bits [Elias (1987)]. This is achieved by encoding blocks of length n,
n > 2 (if n = 2, the methods of Elias and von Neumann coincide). To
measure the effectiveness of the algorithm, Elias introduced the quantity
ηn defined as the ratio of the mean length of the resulting codeword to the
blocklength n. He showed that the natural upper bound for ηn is the
source entropy h(p). Recall from Sec. 7.4, Eq. (7.7), that in the binary case,
h(p) = −(p log p + (1 − p) log(1 − p)). (It is easy to understand why h(p) is
the maximum possible value for the ratio between the lengths of output and
input sequences. Informally, on the one hand, the entropy is a measure of
uncertainty or randomness of the input sequence. On the other hand, the
entropy of the output completely random sequence is equal to its length,
since the entropy of one completely random bit is 1.)
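As a quick numeric illustration (our sketch; the function h follows Eq. (7.7)), the entropy cap can be computed directly and compared with von Neumann's rate:

```python
from math import log2

def h(p):
    """Binary entropy h(p) = -(p log p + (1 - p) log(1 - p)) of Eq. (7.7)."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

# A fair source carries one bit of entropy per symbol; a biased one less:
print(h(0.5))            # 1.0
print(round(h(0.8), 3))  # 0.722
```

At p = 0.8, any refinement procedure can extract at most about 0.72 output bits per source symbol, whereas von Neumann's rule extracts only p(1 − p) = 0.16: there is much room for improvement.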
In fact, the idea of Elias's approach can be illustrated by an algorithm
we have already discussed. Recall the description of the ideal cipher con-
struction in Sec. 7.6. There a message generated by a binary memoryless
source with unknown statistics (i.e. our input to a refinement algorithm, in
terms of the present section) was transformed into the sequence

u(ū1)v(ū1)w(ū1) u(ū2)v(ū2)w(ū2) . . . ,

where ūi, i = 1, 2, . . . , denote the input blocks of n bits and u(·), v(·), w(·)
some transformations, w(ūi) being completely random. Now the output of
Elias's algorithm can be obtained by simply discarding the words u(ūi) and
v(ūi), i.e. it will consist of the concatenated words w(ūi). We may summarise
that this approach gives the following refinement conversion of the
source sequence:

ū1ū2 . . . → w(ū1)w(ū2) . . . .
Example 9.2 Let the same message as in Example 9.1 be generated,
ū = 1101110110. Split it into 5-bit blocks: ū = 11011, 10110, where ū1 =
11011, ū2 = 10110. Apply to these blocks the coding method described
in Sec. 7.6 (with the obvious substitution of a1 for 0 and a2 for 1). We obtain
w(ū1) = (10)2, w(ū2) = (110)2, see p. 133. Hence, the source message
ū converts to the completely random sequence w(ū1)w(ū2):

1101110110 → 10110.

So, we have extracted 5 completely random bits.
Random Numbers in Cryptography

By Proposition 7.5 the average length of each word w(ūi) is greater
than nh(p) − 2 log(n + 1). Hence we obtain the lower bound for ηn:

ηn ≥ (nh(p) − 2 log(n + 1))/n = h(p) − 2 log(n + 1)/n.

It is clearly seen that as n increases, ηn approaches h(p).
So, the Elias algorithm allows the effectiveness ηn to approach the
upper bound (the Shannon entropy) as the blocksize n increases. But, at the
same time, it requires the memory size to grow as 2^n, which makes the
algorithm intractable for large n. In [Ryabko and Matchikina (2000)], a
significantly more efficient method was suggested, with the memory size
growing as n log² n. We refer the interested reader to the cited paper for
more details.
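The lower bound h(p) − 2 log(n + 1)/n can be tabulated directly; the sketch below (ours) shows how the guarantee creeps toward the entropy as the blocklength grows:

```python
from math import log2

def h(p):
    """Binary entropy of Eq. (7.7)."""
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def eta_lower_bound(n, p):
    """Elias's guarantee: eta_n >= h(p) - 2*log2(n + 1)/n."""
    return h(p) - 2 * log2(n + 1) / n

# For p = 0.8 (h(p) ~ 0.722) the bound climbs toward h(p) as n grows;
# for small n it is even vacuous (negative).
for n in (8, 64, 1024, 65536):
    print(n, round(eta_lower_bound(n, 0.8), 3))
```

The slow convergence makes the exponential memory cost of large n painful, which is exactly the motivation for the more efficient method cited above.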

9.3 Pseudo-Random Number Generators

In Sec. 8.4, we have already considered pseudo-random number generators
used for constructing stream ciphers and discussed the requirements for
cryptographically secure generators. In this section, we shall briefly touch
on some additional issues.
First, let us comment on the last two requirements given on p. 161. They claim
that (a) the determination of the next element of the sequence given the
preceding elements should be an infeasible problem and (b) the sequence
produced by the generator should be indistinguishable from a completely
random sequence by any statistical test. It is clear that computational
complexity here is crucial. For example, if we are not computationally
bounded, then any pseudo-random generator can be predicted. Suppose
we have a generated sequence z1, z2, . . . , zi and wish to predict (to guess)
the next bit zi+1. We know that the whole sequence depends on some initial (secret)
state of the generator which can be represented as an integer s. We can try
one value of s after another until an s is found that makes the generator
produce exactly z1, z2, . . . , zi. Knowing s we can easily compute zi+1.
If the length of s is |s| bits, this approach requires about 2^|s| computations
and is infeasible if |s| is about 100 bits or greater.
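This exhaustive seed search is easy to demonstrate with a deliberately weak toy generator. Everything below (the generator, the names, the 16-bit seed space) is our illustration, not a construction from the text:

```python
import random

def toy_generator(seed, length):
    """A toy, non-cryptographic bit generator standing in for the PRNG."""
    rng = random.Random(seed)
    return [rng.randrange(2) for _ in range(length)]

def predict_next_bit(observed):
    """Exhaustive seed search: try every state s until the generator
    reproduces the observed prefix, then compute the next bit.
    The cost is about 2^|s| trials, hopeless for |s| >= 100 bits."""
    i = len(observed)
    for s in range(2 ** 16):  # a deliberately tiny 16-bit seed space
        if toy_generator(s, i) == observed:
            return toy_generator(s, i + 1)[i]
    return None

secret = 12345                     # |s| = 16 bits: the attack is easy
prefix = toy_generator(secret, 40)
print(predict_next_bit(prefix) == toy_generator(secret, 41)[40])  # True
```

With a 16-bit state the search finishes in a fraction of a second; with a 100-bit state the same loop would need about 2^100 trials, which is the whole point of the computational bound.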
The problem with practical pseudo-random generators, such as RC4, is
that the computational bounds for their security are not determined with
mathematical rigour. In the case of block-cipher-based generators, the computational
bound of security is proved to be exponential if the underlying block cipher

is “ideal”. But, in turn, for practical block ciphers, conformance
to the “ideality” conditions is not proved. So, many investigators are en-
gaged in searching for new algorithms of statistical testing and prediction in
the hope of finding some breaches in the generators' constructions, and some
recent results in the field will be presented in the following sections.
However, there exists an elegant theory that allows one to connect the
above problem with other branches of cryptography and with complex-
ity theory. A detailed exposition of this theory can be found, e.g. in
[Goldwasser and Bellare (2001)] but we shall only point out some basic
results. First, it is shown that there exist pseudo-random number generators
that produce sequences indistinguishable from completely random ones by any
polynomial-time test. This means, in particular, that such sequences can se-
curely substitute for completely random sequences in polynomial-time algo-
rithms. Second, it is shown that both requirements on generated sequences,
marked (a) and (b) above, are equivalent if we deal with polynomial-time
algorithms. Third, the connection is established between pseudo-random
number generators and one-way functions which, in particular, allows one to
construct “provably secure” generators based on one-way functions. The
quotation marks here mean that the proofs are valid under certain conditions
typical of complexity theory, e.g. they are based on unproved assumptions
about the one-wayness of certain functions. We shall describe one such provably secure
generators based on RSA.
The parameters of the RSA-based generator are two large primes P
and Q (P ≠ Q), their product N = PQ, and a number e ≥ 3 coprime to
(P − 1)(Q − 1). The initial state x0 is a randomly chosen number from the
interval (1, N − 1). The generator produces a sequence of bits z1, z2, . . . , zk
according to the following scheme:

xi ← (xi−1)^e mod N,
zi ← least significant bit of xi,    i = 1, 2, . . . , k.

Note that the number e can be 3, which simplifies exponentiation.
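The scheme may be sketched as follows (our illustration; the toy primes here are vastly smaller than the hundreds-of-digits primes required in practice, so this code has no real security):

```python
from math import gcd

def rsa_prg(P, Q, x0, k, e=3):
    """Sketch of the RSA-based generator: iterate x_i = (x_{i-1})^e mod N
    and emit z_i, the least significant bit of x_i."""
    N = P * Q
    # e must be coprime to (P-1)(Q-1) for RSA exponentiation to be valid
    assert P != Q and gcd(e, (P - 1) * (Q - 1)) == 1
    x, z = x0, []
    for _ in range(k):
        x = pow(x, e, N)   # modular exponentiation, cheap for e = 3
        z.append(x & 1)    # z_i = least significant bit of x_i
    return z

# Toy parameters (illustrative only):
print(rsa_prg(P=1019, Q=1583, x0=777, k=16))
```

Each output bit costs a full modular exponentiation, which is why the generator is much slower than the stream-cipher generators of Sec. 8.4.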
The described generator is proved to be cryptographically secure under
the assumption that RSA cannot be broken. This generator is significantly
slower than the generators described in Sec. 8.4 and is therefore unsuitable
for stream ciphers. However, it may be used successfully in other tasks
where high security is paramount. For example, it may be used to generate
parameters for cryptographic protocols based on a relatively small random
number x0. Other generators that make use of one-way functions may be

found in the literature, see, e.g. [Goldwasser and Bellare (2001); Menezes
et al. (1996)].

9.4 Statistical Tests for Random and Pseudo-Random Number Generators

Random and pseudo-random numbers are widely used not only in cryptog-

