<<

. 86
( 137 .)



>>

Holland calls this property implicit parallelism. The computational effort for a
genetic algorithm is proportional to the size of the population, and in this
effort, the algorithm is usefully processing a number of schemata proportional
to n3. The property of implicit parallelism should not be confused with explicit
parallelism that is available when running genetic algorithms on a distributed
network of workstations or on a computer with multiple processors.
The Schema Theorem gives us insight into why genomes work better when
there are only two symbols (0s and 1s) in the representation. Finding the best
building blocks requires processing as many schemata as possible from one
generation to the next. For two symbols, the number of different genomes of a
given length is 2length and the number of different schemata is 3length. Roughly,
the number of unique schemata being processed by a single genome is about
1.5length. Now, what happens if there are more symbols in the alphabet, say by
adding 2 and 3? Now the number of genomes of a given length is 4length, and
the number of different schemata is 5length (because the asterisk adds one more
symbol). Although there are more schemata, the number of schemata corre­
sponding to a given genome is only 1.25length. As the number of symbols
increases, the relative number of schemata decreases. Another way of looking
at this is to consider the schema *00. If there are only two letters in the alpha­
bet, then only two genomes process this schema, 000 and 100. If there are four
letters, then there are four genomes: 000, 100, 200, and 300. Since genetic algo­
rithms are trying to find the best schemata using a given population size, the
additional genomes do not help the search.
Schemata are the building blocks of the solutions, and using only two sym­
bols allows the maximum number of schemata to be represented in a given
population size. These estimates are not exact, but they are suggestive. More
rigorous treatment confirms the result that an alphabet of two symbols is opti­
mal from the point of view of processing schemata.


More Applications of Genetic Algorithms
Genetic algorithms have been used to solve some real problems. This section
talks about two applications of genetic algorithms. The first is their application
to neural networks and the second is their application to predictive modeling.
Genetic Algorithms 439


Application to Neural Networks
Neural networks and genetic algorithms are natural allies. One of the
strengths of genetic algorithms is their ability to work on black boxes”that is,
on problems where the fitness function is available but the details of the calcu­
lations are not known. The use of genetic algorithms to train neural networks
is a good example, although this method of training is not common.
Figure 13.4 illustrates a simple neural network with three input nodes, a
hidden layer with two nodes, and a single output node. The key to making the
network work well is adjusting the weights on its edges so that the output pro­
duces the right answer for appropriate inputs. Chapter 7 discussed the nature
of the functions inside the nodes and how standard training algorithms pro­
ceed. For the current discussion, all we need is that the network can produce
an output for any given set of weights and inputs. The weights are real num­
bers, and there is a training set that includes inputs and the corresponding cor­
rect output.


input 1
0
w 03
w04

3
w
input 2 output
13 w 35
1 5
w14
w45
4
w23
w24
input 3
2


Without really even understanding how a neural network works, the weights
can be gathered into a genome so a genetic algorithm can optimize them.



w03 w04 w13 w14 w23 w24 w35 w45




Each weight gets
10110001 represented by some
number of bits.
Figure 13.4 A neural network is described by weights that genetic algorithms
can optimize.
440 Chapter 13


The first problem faced is to determine what a genome looks like. The
genome consists of all the weights in the network glommed together. What is
the fitness function? The fitness function creates a network using the weights
and then applies this model to the training set. The fitness function then com­
pares the predicted output of this neural network to the actual output; hence,
the fitness function is defined as the overall error of the neural network with
those weights on the training set. The genetic algorithm proceeds by minimiz­
ing this function.
Another application to neural networks is determining the topology of the
network”how many nodes should be in the hidden layer and what activation
functions should be used. Different topologies can be described using different
sets of weights, and then the genetic algorithms can proceed to find the best
ones. In this case, the fitness function creates the network described by the
genome, and then trains the network using standard training methods, and
the error from the best network is used for the fitness function. This is an
example of genetic algorithms being used to evolve optimal solutions for a
complicated problem.


Case Study: Evolving a Solution for Response Modeling
A more interesting use of genetic algorithms is to solve a real business prob­
lem. Direct feedback from customers is a powerful source of information for
businesses. When a customer makes a complaint, the company has an oppor­
tunity to make a good impression by fixing the problem promptly or, if it is too
late for that, by making up for the problem somehow. For some companies,
such as product goods manufacturers, complaints provide dates of actual
product use”a bit of additional information to add to manufacturing and
shipping dates. Customer complaints also hand companies an opportunity to
improve processes so that they have fewer dissatisfied customers in the future.
In our work building retention models for mobile phone companies, we
have seen situations where customers who make calls to customer service are
more loyal than other customers. Apparently, responding to the expressed
needs of customers can make them happier and more loyal, especially when
the response is prompt and appropriate. At another mobile phone company,
calls to customer service indicated a higher probability of churn, due no doubt
to the long wait periods at their call centers.
This case study talks about classifying complaints for routing complaints
versus compliments, using the ideas of genetic algorithms.

Business Context
The custom service department of a major international airline processes
many customer comments, which arrive via several channels:
Genetic Algorithms 441


Response cards included in the in-flight magazine
––

Comment forms on the airline™s Web site
––

Telephone calls to the customer service center
––

Cards, letters, and email messages
––


Different comments have different priorities for responses. Compliments,
for example, may result in an automated “thank you for being a loyal cus­
tomer” type of message. On the other hand, all complaints need at least to be
acknowledged, and many complaints require follow-up action. The sooner the
company responds, the better the chance of keeping a perhaps valuable, but
disgruntled, customer.
Airline personnel spend significant amounts of time analyzing customer
comments, first sorting them into complaints and other comments, and then
routing the complaints to the appropriate group for follow-up. When cus­
tomers are already upset about lost baggage, canceled flights, rude treatment,
or lousy food, a slow or inappropriate response only makes things worse. This
particular airline decided to reduce the time it took to respond to a complaint
by automating the initial categorization of comments. Their approach evolved
a solution using software from Genalytics (www.genalytics.com), a software
company in Newburyport, MA.

Data
All customer comments end up in a comment database, regardless of the chan­
nel they come in by. This database includes both fixed fields describing the
comment and the actual text itself. A complete customer comment record has
the following fields:
Date
––

Source (email, comment card, telephone contact, letter, other)
––

Flight number
––

Class of service
––

Departure airport
––

Destination airport
––

Mileage account number
––

Organization receiving comment
––

Names of involved airline employee(s) if mentioned
––

Free-text comments
––


Some records are missing data for some fields. Comments coming in through
the call center are usually filled in correctly, because the call-center reps are
442 Chapter 13


trained to fill in all the fields. However, left to themselves, customers may not
fill in all the fields when sending a comment card or email message.
The first step is preprocessing the text. The company preprocesses the com­
ments to correct certain spelling errors and to create a large number of derived
variables about the content (is the word “food” present? is the word “meal”
present? and so on). Such derived variables are created for every word in the
database that occurs more than some threshold number of times across all
messages and is not a very common word such as “of” or “the.” Some of the
new variables convey metadata about the comment, such as its size in bytes
and the number of distinct words in contains. Together, these variables form
the comment header. The comment itself is not used, instead the various
derived variables are used.




Y
The Data Mining Task: Evolving a Solution



FL
The data mining task was to come up with a model that takes as input a large
number of variables describing each customer comment and somehow com­
AM
bine them to come up with a classification. The specific task was to classify
comment signatures based on whether or not they are complaints. There are
several ways of approaching this, such as using decision trees or clustering. In
TE

this case, though, the company evolved a solution.
Solving a problem with genetic algorithms requires genomes and a fitness
function. The genomes are based on the preprocessed comments, one genome
per comment. First, a few more fields are added for interaction variables, such
as whether both “baggage” and “JFK” are mentioned or whether both “food”
and “chicken” are mentioned. The header, metadata variables and interaction
variables form the comment signature, as shown in Figure 13.5.


To: comments@airline.com
From: random_customer

. . . My baggage was lost at JFK when I changed planes . . .
d”
ge

l”





an
od




ea





<<

. 86
( 137 .)



>>