

Data mining results change over time. Models expire and become less useful as
time goes on. One cause is that data ages quickly. Markets and customers
change quickly as well.
Data mining provides feedback into other processes that may need to change.
Decisions made in the business world often affect current processes and
interactions with customers. Often, looking at data finds imperfections in
operational systems, imperfections that should be fixed to enhance future
customer understanding.
The rest of this chapter looks at some more examples of the virtuous cycle of
data mining in action.

A Wireless Communications Company
Makes the Right Connections
The wireless communications industry is fiercely competitive. Wireless phone
companies are constantly dreaming up new ways to steal customers from their
competitors and to keep their own customers loyal. The basic service offering
is a commodity, with thin margins and little basis for product differentiation,
so phone companies think of novel ways to attract new customers.
This case study talks about how one mobile phone provider used data mining
to improve its ability to recognize customers who would be attracted to a
new service offering. (We are indebted to Alan Parker of Apower Solutions for
many details in this study.)

The Opportunity
This company wanted to test market a new product. For technical reasons,
their preliminary roll-out tested the product on a few hundred subscribers, a
tiny fraction of the customer base in the chosen market.
The initial problem, therefore, was to figure out who was likely to be interested
in this new offering. This is a classic application of data mining: finding
the most cost-effective way to reach the desired number of responders. Since
fixed costs of a direct marketing campaign are constant by definition, and the
cost per contact is also fairly constant, the only practical way to reduce the total
cost of the campaign is to reduce the number of contacts.
The company needed a certain number of people to sign up in order for the
trial to be valid. The company's past experience with new-product introduction
campaigns was that about 2 to 3 percent of existing customers would
respond favorably. So, to reach 500 responders, they would expect to contact
between about 16,000 and 25,000 prospects.
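The arithmetic behind this estimate can be sketched in a few lines. This is only an illustration: the function names and the cost figures (a fixed cost of 10,000 and a cost of 1 per contact) are hypothetical and do not come from the case study; only the 500-responder target and the 2 to 3 percent response rates are taken from the text.

```python
def contacts_needed(target_responders: int, response_rate_pct: int) -> int:
    """Expected number of contacts needed to reach the target, rounded up.

    The rate is given in whole percent so that the calculation can use
    integer arithmetic and avoid floating-point rounding surprises.
    """
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response_rate_pct must be between 1 and 100")
    # Ceiling division for positive integers: -(-a // b) rounds up.
    return -(-target_responders * 100 // response_rate_pct)


def campaign_cost(fixed_cost: int, cost_per_contact: int, contacts: int) -> int:
    """Total cost = fixed cost + (cost per contact x number of contacts)."""
    return fixed_cost + cost_per_contact * contacts


# Response rates from the text: 3 percent (optimistic) and 2 percent (pessimistic).
best_case = contacts_needed(500, 3)
worst_case = contacts_needed(500, 2)

# Hypothetical cost figures, for illustration only.
best_cost = campaign_cost(10_000, 1, best_case)
worst_cost = campaign_cost(10_000, 1, worst_case)
```

With the fixed cost and the cost per contact held constant, the only term that changes between the two cases is the number of contacts, which is why better targeting is the practical lever for lowering total cost.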
The Virtuous Cycle of Data Mining 35

How should the targets be selected? It would be handy to give each prospective
customer a score from, say, 1 to 100, where 1 means “is very likely to purchase
the product” and 100 means “very unlikely to purchase the product.”
The prospects could then be sorted according to this score, and marketing
could work down this list until reaching the desired number of responders. As
the cumulative gains chart in Figure 2.3 illustrates, contacting the people most
likely to respond achieves the quota of responders with fewer contacts, and
hence at a lower cost.
The next chapter explains cumulative gains charts in more detail. For now,
it is enough to know that the curved line is obtained by ordering the scored
prospects along the X-axis with those judged most likely to respond on the left
and those judged least likely on the right. The diagonal line shows what would
happen if prospects were selected at random from all prospects. The chart
shows that good response scores lower the cost of a direct marketing campaign
by allowing fewer prospects to be contacted.
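The points on such a curve can be computed from a list of scored prospects whose responses are already known. The following is a minimal sketch, assuming the scoring convention described earlier (a lower score means a more likely responder); the function name and the ten sample prospects are made up for illustration.

```python
def cumulative_gains(scores, responded):
    """Return (fraction contacted, fraction of responders captured) pairs.

    scores    -- one score per prospect; lower means more likely to respond
    responded -- 1 if the prospect actually responded, 0 otherwise
    """
    total_responders = sum(responded)
    if total_responders == 0:
        raise ValueError("at least one responder is required")
    # Most likely responders first, as on the left of the X-axis.
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0])
    points = []
    captured = 0
    for contacted, (_, flag) in enumerate(ranked, start=1):
        captured += flag
        points.append((contacted / len(ranked), captured / total_responders))
    return points


# Hypothetical scored prospects (illustrative data only).
scores = [5, 90, 20, 70, 10, 60, 95, 30, 80, 40]
responded = [1, 0, 1, 0, 1, 0, 0, 0, 0, 1]
gains = cumulative_gains(scores, responded)
```

In this made-up sample, contacting the best-scored 30 percent of prospects captures three of the four responders, whereas random selection (the diagonal) would be expected to capture only about 30 percent of them.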
How did the mobile phone company get such scores? By data mining, of course.

How Data Mining Was Applied
Most data mining methods learn by example. The neural network or decision
tree generator or what have you is fed thousands and thousands of training
examples. Each of the training examples is clearly marked as being either a
responder or a nonresponder. After seeing enough of these examples, the tool
comes up with a model in the form of a computer program that reads in
unclassified records and updates each with a response score or classification.
In this case, the offer in question was a new product introduction, so there
was no training set of people who had already responded. One possibility
would be to build a model based on people who had ever responded to any
offer in the past. Such a model would be good for discriminating between people
who refuse all telemarketing calls and throw out all junk mail, and those
who occasionally respond to some offers. These types of models are called
nonresponse models and can be valuable to mass mailers who really do want their
message to reach a large, broad market. The AARP, a non-profit organization
that provides services to retired people, saved millions of dollars in mailing
costs when it began using a nonresponse model. Instead of mailing to every
household with a member over 50 years of age, as they once did, they discard
the bottom 10 percent and still get almost all the responders they would have.
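The idea of discarding the lowest-scored tenth of a mailing list can be sketched as follows. This is an illustration only, not a description of how the AARP model worked: the function name, the household identifiers, and the scores are hypothetical, and here a higher score is assumed to mean a more likely responder.

```python
def drop_bottom_fraction(households, fraction=0.10):
    """Keep all but the lowest-scored fraction of households.

    households -- list of (household_id, response_score) pairs, where a
                  higher score means a more likely responder (an assumption
                  made for this sketch)
    """
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    ranked = sorted(households, key=lambda h: h[1], reverse=True)
    n_dropped = int(len(ranked) * fraction)
    return ranked[: len(ranked) - n_dropped]


# Hypothetical scored households (illustrative data only).
households = [
    ("h1", 0.91), ("h2", 0.05), ("h3", 0.40), ("h4", 0.62), ("h5", 0.33),
    ("h6", 0.78), ("h7", 0.12), ("h8", 0.55), ("h9", 0.27), ("h10", 0.84),
]
mailing_list = drop_bottom_fraction(households)
```

A nonresponse model of this kind trims only the least promising tail of a very large list, which is the opposite of what is needed when the goal is to find a few hundred responders.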
However, the wireless company only wanted to reach a few hundred
responders, so a model that identified the top 90 percent would not have
served the purpose. Instead, they formed a training set of records from a similar
new product introduction in another market.
36 Chapter 2


Figure 2.3 Ranking prospects, using a response model, makes it possible to save money
by targeting fewer customers and getting the same number of responders.

Defining the Inputs
The data mining techniques described in this book automate the central core of
the model building process. Given a collection of input data fields and a target
field (in this case, purchase of the new product), they can find patterns and
rules that explain the target in terms of the inputs. For data mining to succeed,
there must be some relationship between the input variables and the target.
In practice, this means that it often takes much more time and effort to identify,
locate, and prepare input data than it does to create and run the models,
especially since data mining tools make it so easy to create models. It is impossible
to do a good job of selecting input variables without knowledge of the
business problem being addressed. This is true even when using data mining
tools that claim the ability to accept all the data and figure out automatically
which fields are important. Information that knowledgeable people in the
industry expect to be important is often not represented in raw input data in a
way data mining tools can recognize.
The wireless phone company understood the importance of selecting the
right input data. Experts from several different functional areas, including
marketing, sales, and customer support, met with outside data mining
consultants to brainstorm about the best way to make use of available data.
There were three data sources available:
A marketing customer information file
A call detail database
A demographic database
The call detail database was the largest of the three by far. It contained a
record for each call made or received by every customer in the target market.
The marketing database contained summarized customer data on usage,
tenure, product history, price plans, and payment history. The third database
contained purchased demographic and lifestyle data about the customers.

Derived Inputs
As a result of the brainstorming meetings and preliminary analysis, several
summary and descriptive fields were added to the customer data to be used as
input to the predictive model:
Minutes of use
Number of incoming calls
Frequency of calls
Sphere of influence
Voice mail user flag

Some of these fields require a bit of explanation. Minutes of use (MOU) is a
standard measure of how good a customer is. The more minutes of use, the
better the customer. Historically, the company had focused on MOU almost to
the exclusion of all other variables. But MOU masks many interesting differences:
2 long calls or 100 short ones? All outgoing calls or half incoming? All
calls to the same number or calls to many numbers? The next items in the
above list are intended to shed more light on these questions.
Sphere of influence (SOI) is another interesting measure because it was
developed as a result of an earlier data mining effort. A customer's SOI is the
number of people with whom she or he had phone conversations during a
given time period. It turned out that high SOI customers behaved differently,
as a group, than low SOI customers in several ways including frequency of
calls to customer service and loyalty.
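Derived fields of this kind are summaries of the call detail records. The sketch below shows one way such a summary might be computed; the record layout (customer, other party, direction, minutes), the field names, and the sample calls are all hypothetical, since the company's actual data layout is not described.

```python
from collections import defaultdict

# Hypothetical call detail records: (customer_id, other_party, direction, minutes)
calls = [
    ("A", "555-0101", "out", 12.0),
    ("A", "555-0102", "in", 3.5),
    ("A", "555-0101", "out", 1.5),
    ("B", "555-0199", "in", 40.0),
]


def derive_inputs(call_records):
    """Summarize call detail records into one row of derived inputs per customer."""
    minutes = defaultdict(float)
    total_calls = defaultdict(int)
    incoming_calls = defaultdict(int)
    parties = defaultdict(set)
    for customer, other_party, direction, call_minutes in call_records:
        minutes[customer] += call_minutes
        total_calls[customer] += 1
        if direction == "in":
            incoming_calls[customer] += 1
        # Sphere of influence counts distinct people, not calls.
        parties[customer].add(other_party)
    return {
        customer: {
            "minutes_of_use": minutes[customer],
            "total_calls": total_calls[customer],
            "incoming_calls": incoming_calls[customer],
            "sphere_of_influence": len(parties[customer]),
        }
        for customer in total_calls
    }


derived = derive_inputs(calls)
```

In this made-up sample, customer B has more minutes of use than customer A but a sphere of influence of only one, which is the kind of difference that MOU alone would hide.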

The Actions
Data from all three sources was brought together and used to create a data
mining model. The model was used to identify likely candidates for the new
product. Two direct mailings were made: one to a list based on the results of
the data mining model and one to a control group selected using
business-as-usual methods. As shown in Figure 2.4, 15 percent of the people in the
target group purchased the new product, compared to only 3 percent in the
control group.

[Bar chart: 15% of the target market responded; 3% of the control group responded.]
Figure 2.4 These results demonstrate a very successful application of data mining.
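One common way to summarize such a comparison is lift, the ratio of the response rate in the model-selected group to the response rate in the control group. The short sketch below uses the two percentages reported above; the function name is an illustrative choice.

```python
def lift(target_rate_pct: float, control_rate_pct: float) -> float:
    """Ratio of the modeled group's response rate to the control group's."""
    if control_rate_pct <= 0:
        raise ValueError("control_rate_pct must be positive")
    return target_rate_pct / control_rate_pct


# Response rates reported for the two mailings: 15 percent and 3 percent.
campaign_lift = lift(15, 3)
```

A lift of 5 means that, per piece of mail sent, the model-selected list produced five times as many buyers as the business-as-usual list.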

Completing the Cycle
With the help of data mining, the right group of prospects was contacted for
the new product offering. That is not the end of the story, though. Once the
results of the new campaign were in, data mining techniques could help to get
a better picture of the actual responders. Armed with a profile of the
buyers in the initial test market, and a usage profile of the first several months
of the new service, the company was able to do an even better job of targeting
prospects in the next five markets where the product was rolled out.

Neural Networks and Decision Trees Drive SUV Sales
In 1992, before any of the commercial data mining tools available today were
on the market, one of the big three U.S. auto makers asked a group of
researchers at the Pontikes Center for Management at Southern Illinois University
in Carbondale to develop an “expert system” to identify likely buyers
of a particular sport-utility vehicle. (We are grateful to Wei-Xiong Ho who

