
measure is the nat; in the case of decimal logarithms, the dit.
If $r = 2$, Eq. (7.6) can be written in a different form using the notation $P_1 = p$, $P_2 = 1 - p$. Then

$H(p) = -\bigl(p \log p + (1 - p)\log(1 - p)\bigr)$.    (7.7)

The graph of the entropy for this case is shown in Fig. 7.1.




t:"'

Fig. 7.1 The graph of binary entropy.
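For readers who want to reproduce the curve, here is a minimal Python sketch (the grid step and the base-2 logarithm are choices made only for illustration) that evaluates the binary entropy (7.7) and confirms that its maximum of 1 bit is attained at $p = 1/2$.

import math

def binary_entropy(p: float) -> float:
    """H(p) = -(p*log2(p) + (1-p)*log2(1-p)), with 0*log 0 taken as 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Evaluate H(p) on a grid of points inside (0, 1).
grid = [i / 100 for i in range(1, 100)]
values = [(p, binary_entropy(p)) for p in grid]

# The maximum, 1 bit, is attained at p = 1/2.
p_max, h_max = max(values, key=lambda t: t[1])
print(f"max H(p) = {h_max:.4f} bit at p = {p_max}")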


Consider the simplest properties of entropy.

Proposition 7.1

(1) $H(\xi) \ge 0$; $H(\xi) = 0$ if and only if one of the probabilities equals 1 (and all the others equal 0);
(2) $H(\xi) \le \log r$;
(3) if $P(a_1) = P(a_2) = \dots = P(a_r) = 1/r$, then $H(\xi) = \log r$.
Proof. The first property is quite obvious (see (7.6)). We prove the second property only for $r = 2$ since the general case is similar. Examine the graph of the entropy: we need to find the maximum of the function (7.7). To do that, we find the first and second derivatives of $H(p)$, assuming for simplicity that the logarithms are natural:

$H'(p) = -\ln p - 1 + \ln(1 - p) + 1 = \ln\dfrac{1 - p}{p}.$

Hence $H'(p) = 0$ if $p = 1/2$. Find the second derivative:

$H''(p) = -\dfrac{1}{p} - \dfrac{1}{1 - p} = -\dfrac{1}{p(1 - p)}.$
We can see that $H''(p) < 0$ for $p \in (0, 1)$. This means that the function $H(p)$ attains its maximum at $p = 1/2$ and is convex upward (concave) on the interval $(0, 1)$. So the graph depicted in Fig. 7.1 is justified. It is plain that with any other base of the logarithm the graph is analogous.
The third property follows directly from the definition of entropy (7.6):

$H(\xi) = -\sum_{i=1}^{r} \dfrac{1}{r}\log\dfrac{1}{r} = \log r.$
The physical sense of the entropy is that of a quantitative measure of uncertainty. Consider, for instance, three different random variables $\xi_1$, $\xi_2$, $\xi_3$ taking on the values $a_1$ and $a_2$ ($r = 2$):

$\xi_1:\ P(a_1) = 1,\quad P(a_2) = 0;$
$\xi_2:\ P(a_1) = 0.5,\quad P(a_2) = 0.5;$
$\xi_3:\ P(a_1) = 0.01,\quad P(a_2) = 0.99.$


Intuition suggests that the uncertainty of $\xi_1$ is zero. Indeed,

$H(\xi_1) = -(1 \cdot \log 1 + 0 \cdot \log 0) = 0$

(with the usual convention $0 \cdot \log 0 = 0$).


Look at $\xi_2$ and $\xi_3$. Intuition says that the uncertainty of $\xi_2$ is higher than that of $\xi_3$. Compute the entropies:

$H(\xi_2) = 1$ bit

(it has been computed above),

$H(\xi_3) = -(0.01 \log 0.01 + 0.99 \log 0.99) \approx 0.08$ bit.
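These values are easy to verify; a minimal Python sketch (using base-2 logarithms so that the answers come out in bits) computes all three entropies directly from the definition (7.6).

import math

def entropy(probs):
    """Entropy (7.6) in bits; terms with probability 0 or 1 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if 0 < p < 1)

xi1 = [1.0, 0.0]     # P(a1) = 1,    P(a2) = 0
xi2 = [0.5, 0.5]     # P(a1) = 0.5,  P(a2) = 0.5
xi3 = [0.01, 0.99]   # P(a1) = 0.01, P(a2) = 0.99

print(entropy(xi1))  # 0    -- no uncertainty at all
print(entropy(xi2))  # 1.0  -- maximal uncertainty for r = 2
print(entropy(xi3))  # approximately 0.0808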
We can see that the entropy is indeed a reasonable measure of uncertainty. But the point is not limited to examples of that sort: the entropy plays a key role in many problems of telecommunications and information science. In particular, the entropy characterises the maximum attainable degree of data compression. More exactly, if a source with the (defined below) limit entropy $h$ generates a (sufficiently long) message of $n$ symbols, this message can be compressed to $nh$ bits on average. For example, if $h = 1/2$, the message can be compressed to half of its size. Note that we speak here of so-called lossless compression, when the source message can be recovered without distortion.
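As a rough numerical illustration of the $nh$ bound (a sketch under assumptions of our own: a memoryless binary source with $P(1) = 0.1$, $n = 100{,}000$ symbols, and the general-purpose compressor from Python's zlib module, which does not reach the entropy limit exactly), one can compare the entropy estimate with the size actually achieved by a lossless compressor.

import math
import random
import zlib

random.seed(0)

# Memoryless binary source with skewed symbol probabilities (assumed for illustration).
p_one = 0.1
n = 100_000
symbols = random.choices("01", weights=[1 - p_one, p_one], k=n)

# Per-symbol entropy h of this source, in bits.
h = -(p_one * math.log2(p_one) + (1 - p_one) * math.log2(1 - p_one))
print(f"entropy bound: about {n * h / 8:.0f} bytes ({h:.3f} bits per symbol)")

# Pack the 0/1 symbols into bytes (8 symbols per byte) and compress losslessly.
packed = int("".join(symbols), 2).to_bytes((n + 7) // 8, "big")
compressed = zlib.compress(packed, level=9)
print(f"zlib achieves: {len(compressed)} bytes (uncompressed: {len(packed)} bytes)")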
Consider now a two-dimensional random variable $(\xi_1, \xi_2)$ defined by the following distribution:

$P_{ij} = P(\xi_1 = a_i,\ \xi_2 = b_j), \quad i = 1, \dots, r,\quad j = 1, \dots, s.$

Introduce the following notation:

$P_{i\cdot} = P(\xi_1 = a_i) = \sum_{j=1}^{s} P_{ij}, \qquad P_{\cdot j} = P(\xi_2 = b_j) = \sum_{i=1}^{r} P_{ij}.$

By analogy with (7.6), define the entropy of the two-dimensional random variable

$H(\xi_1, \xi_2) = -\sum_{i=1}^{r}\sum_{j=1}^{s} P_{ij} \log P_{ij}.$    (7.9)
Similarly, for a three-dimensional random variable $(\xi_1, \xi_2, \xi_3)$ with probability distribution $P_{ijk}$, define

$H(\xi_1, \xi_2, \xi_3) = -\sum_{i}\sum_{j}\sum_{k} P_{ijk} \log P_{ijk}.$    (7.10)

The entropy of an $n$-dimensional random variable is defined in the same way.


Suppose now that the value of $\xi_1$ is known but the value of $\xi_2$ is not. Then it is natural to define the conditional entropy

$H(\xi_2 \mid \xi_1) = -\sum_{i=1}^{r}\sum_{j=1}^{s} P_{ij} \log P(\xi_2 = b_j \mid \xi_1 = a_i).$    (7.11)

This is the mean conditional entropy of the random variable $\xi_2$ under the condition that the value of $\xi_1$ is known.

Proposition 7.2 (properties of two-dimensional entropy)

$H(\xi_1, \xi_2) = H(\xi_1) + H(\xi_2 \mid \xi_1);$    (7.12)

in particular, for independent random variables $\xi_1$ and $\xi_2$,

$H(\xi_2 \mid \xi_1) = H(\xi_2),$

$H(\xi_1, \xi_2) = H(\xi_1) + H(\xi_2).$    (7.13)

Recall from probability theory that $\xi_1$, $\xi_2$ are independent if $P_{ij} = P_{i\cdot} P_{\cdot j}$ for all $i$ and $j$. The proof of the proposition is quite simple and can be found in the literature; we confine ourselves to its interpretation. Let $\xi_1$ be observed in the first experiment and $\xi_2$ in the second. Then the total uncertainty must be equal to the uncertainty of the first experiment plus the conditional uncertainty of the second. In the case of independent $\xi_1$ and $\xi_2$, knowledge of one variable does not offer any information about the value of the other, which corresponds to (7.13).
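The identity (7.12) is also easy to check numerically. In the following sketch the $2 \times 2$ joint distribution is an arbitrary one chosen only for illustration; the program computes $H(\xi_1, \xi_2)$, $H(\xi_1)$ and $H(\xi_2 \mid \xi_1)$ according to (7.9), (7.6) and (7.11) and confirms that both sides of (7.12) coincide.

import math

def entropy(probs):
    """Entropy (7.6) in bits; terms with probability 0 or 1 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if 0 < p < 1)

# An arbitrary joint distribution P_ij of (xi_1, xi_2), chosen only for illustration.
P = [[0.30, 0.10],
     [0.15, 0.45]]

P_row = [sum(row) for row in P]   # marginal distribution of xi_1

# Joint entropy (7.9) and conditional entropy (7.11).
H_joint = entropy([P[i][j] for i in range(2) for j in range(2)])
H_cond = -sum(P[i][j] * math.log2(P[i][j] / P_row[i])
              for i in range(2) for j in range(2) if P[i][j] > 0)

# (7.12): H(xi_1, xi_2) = H(xi_1) + H(xi_2 | xi_1).
print(round(H_joint, 10) == round(entropy(P_row) + H_cond, 10))  # True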
Let an $n$-dimensional random variable $(\xi_1, \xi_2, \dots, \xi_n)$ be given. Then the following equation holds:

$H(\xi_1, \dots, \xi_n) = H(\xi_1) + H(\xi_2 \mid \xi_1) + H(\xi_3 \mid \xi_1, \xi_2) + \dots + H(\xi_n \mid \xi_1, \dots, \xi_{n-1}).$    (7.14)
For independent random variables,

$H(\xi_1, \dots, \xi_n) = \sum_{i=1}^{n} H(\xi_i)$    (7.15)

(observe that (7.12), (7.13) are special cases of (7.14), (7.15)).
In the general case,

$H(\xi_1, \dots, \xi_n) \le \sum_{i=1}^{n} H(\xi_i).$
Consider a sequence of random variables $\xi_1, \xi_2, \xi_3, \dots$ ($\xi_i$ take on values in $A$), which may be interpreted as a random process with discrete time. We shall assume that this process is stationary, i.e., informally, that the probabilities for $(\xi_1, \dots, \xi_n)$ are the same as for $(\xi_{\Delta+1}, \dots, \xi_{\Delta+n})$ for all positive integers $n$ and $\Delta$.
Let $H(\xi_1, \dots, \xi_n)$ be the entropy of the $n$-dimensional random variable $(\xi_1, \dots, \xi_n)$. Denote by

$h_n^* = \dfrac{H(\xi_1, \dots, \xi_n)}{n}$

the entropy rate of the $n$th order, and define

$h_n = H(\xi_n \mid \xi_1, \dots, \xi_{n-1}).$

Note the following properties:

$h_n^* \le h_{n-1}^*, \quad n > 1,$    (7.17)
$h_n \le h_n^*,$    (7.18)
$h_n \le h_{n-1}, \quad n > 1.$    (7.19)

For independent $\xi_1, \xi_2, \dots, \xi_n$ the equalities

$h_n^* = h_n = H(\xi_1)$

hold. (A process generating independent random variables is said to be a memoryless process.)
Theorem 7.3 For a stationary process, the limits $\lim_{n\to\infty} h_n^*$ and $\lim_{n\to\infty} h_n$ exist and are equal.

Denote the common value of these limits by $h_\infty$:

$h_\infty = \lim_{n\to\infty} h_n^* = \lim_{n\to\infty} h_n.$    (7.20)
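For one concrete case where the limit (7.20) can be computed in closed form, take a stationary two-state Markov chain (the transition matrix below is an assumption made purely for illustration). By the Markov property, $h_n = H(\xi_n \mid \xi_{n-1})$ for every $n \ge 2$, so $h_\infty$ equals the one-step conditional entropy averaged over the stationary distribution; a short sketch makes the comparison with $h_1$ explicit.

import math

def entropy(probs):
    """Entropy (7.6) in bits."""
    return -sum(p * math.log2(p) for p in probs if 0 < p < 1)

# Assumed two-state transition matrix: P[i][j] = P(next state = j | current state = i).
P = [[0.9, 0.1],
     [0.4, 0.6]]

# Stationary distribution pi of a two-state chain: pi_0 = P[1][0] / (P[0][1] + P[1][0]).
pi0 = P[1][0] / (P[0][1] + P[1][0])
pi = [pi0, 1 - pi0]

h1 = entropy(pi)                                      # h_1 = H(xi_1)
h_inf = sum(pi[i] * entropy(P[i]) for i in range(2))  # h_n for n >= 2, hence h_infinity

print(f"h_1   = {h1:.4f} bits per letter")
print(f"h_inf = {h_inf:.4f} bits per letter")         # smaller than h_1, consistent with (7.19)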


Let the alphabet $A = (a_1, a_2, \dots, a_r)$ be given. We know that

$\max H(\xi_1) = \log r$

for memoryless processes; therefore, taking into account (7.19) and (7.20), we obtain $\max h_\infty = \log r$, the maximum being reached for memoryless processes with equal letter probabilities $1/r$. It is natural to introduce the quantity

$R = \log r - h_\infty$    (7.21)

called the redundancy (per source letter). Informally, this is, so to speak, the "unused" portion of the alphabet. Redundancy is a quantitative measure of the mutual dependence and non-uniformity of the letters.
Note that in the second example with the Caesar cipher (p. 5), the redundancy of the encrypted message equals zero, since all letters (decimal digits) are equiprobable and independent, i.e. $h_\infty = \log 10$ and $R = 0$. It is this that made the simple Caesar cipher unbreakable.
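As a numerical illustration of (7.21) (with a distribution assumed here only for the sake of the example), compare a strongly skewed memoryless source over a three-letter alphabet with the equiprobable, independent decimal digits just mentioned, for which $R = 0$.

import math

def entropy(probs):
    """Entropy (7.6) in bits."""
    return -sum(p * math.log2(p) for p in probs if 0 < p < 1)

# A memoryless source over a 3-letter alphabet (probabilities assumed for illustration).
probs = [0.8, 0.15, 0.05]
r = len(probs)

h_inf = entropy(probs)           # for a memoryless source h_infinity = H(xi_1)
R = math.log2(r) - h_inf         # redundancy (7.21), in bits per source letter

print(f"h_inf = {h_inf:.3f}, log r = {math.log2(r):.3f}, R = {R:.3f} bits per letter")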


7.5 Unicity Distance for Secret Key Cipher

Consider the secret-key cryptosystem shown in Fig. 1.1 (p. 3). Let the source generate the message $\bar m = m_1 m_2 \dots m_n$. Alice and Bob have the secret key $k$, known only to them, and let $\bar e = e_1 e_2 \dots e_n$ be the ciphertext produced under that key.

Example 7.3 Let a memoryless source generate letters of the alphabet $A = \{a, b, c\}$ with probabilities $P(a) = 0.8$, $P(b) = 0.15$, $P(c) = 0.05$. Let the encrypter substitute the letters of the message by other letters according to some fixed permutation determined by the key $k$:

$(a, b, c) \quad k = 1,$
$(a, c, b) \quad k = 2,$
$(b, a, c) \quad k = 3,$
$(b, c, a) \quad k = 4,$
$(c, a, b) \quad k = 5,$
$(c, b, a) \quad k = 6.$

That is, there are 6 possible keys (from 1 to 6) and if, e.g., $k = 5$, the following substitution is made in the message: $a \to c$, $b \to a$, $c \to b$.
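A minimal sketch of this encrypter (the sample plaintext below is our own, chosen only to show the mechanics): the key selects one of the six permutations of the alphabet, and every letter of the message is replaced by its image under that permutation.

# The six keys of Example 7.3: key k maps the letters (a, b, c) to the listed permutation.
PERMUTATIONS = {
    1: "abc",
    2: "acb",
    3: "bac",
    4: "bca",
    5: "cab",
    6: "cba",
}

def encrypt(message: str, k: int) -> str:
    """Replace every letter according to the permutation selected by the key k."""
    table = str.maketrans("abc", PERMUTATIONS[k])
    return message.translate(table)

# Under k = 5 the substitution is a -> c, b -> a, c -> b.
print(encrypt("aabacab", 5))   # -> "ccacbca"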
Suppose Eve has eavesdropped the ciphertext

