

Data Mining Methodology and Best Practices 83

[Figure 3.13: Cumulative response for targeted mailing compared with mass mailing. Vertical axis: % captured response; horizontal axis: 10 to 100.]

Problems with Lift
Lift solves the problem of how to compare the performance of models of
different kinds, but it is still not powerful enough to answer the most
important questions: Is the model worth the time, effort, and money it cost to
build? Will mailing to a segment where lift is 3 result in a profitable campaign?
These kinds of questions cannot be answered without more knowledge of
the business context, in order to build costs and revenues into the calculation.
Still, lift is a very handy tool for comparing the performance of two models
applied to the same or comparable data. Note that the performance of two
models can only be compared using lift when the test sets have the same
density of the outcome.
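As a rough illustration of why density matters, lift for a segment can be computed as the response rate in the segment divided by the response rate in the whole test set. The sketch below uses made-up counts, and the helper names `lift` and `max_possible_lift` are hypothetical. It shows that the highest lift any model can reach is 1 divided by the density, so a lift of 3 on a test set where 5 percent respond is not comparable to a lift of 3 on an oversampled test set where 50 percent respond (where it could not even occur).

```python
def lift(segment_responders, segment_size, total_responders, total_size):
    """Response rate in the segment divided by the overall response rate."""
    segment_rate = segment_responders / segment_size
    overall_rate = total_responders / total_size
    return segment_rate / overall_rate


def max_possible_lift(density):
    """Upper bound on lift: a segment made up entirely of responders."""
    return 1.0 / density


# Illustrative numbers only: the top decile of a 10,000-record test set
# contains 150 of the 500 responders (15% response versus 5% overall).
print(round(lift(150, 1000, 500, 10000), 2))

# The ceiling on lift depends on the density of the outcome in the test set.
print(round(max_possible_lift(0.05), 2))
print(round(max_possible_lift(0.50), 2))
```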
84 Chapter 3

[Figure 3.14: A lift chart starts high and then goes to 1. Vertical axis: lift value; horizontal axis: 10 to 100.]

Step Nine: Deploy Models
Deploying a model means moving it from the data mining environment to the
scoring environment. This process may be easy or hard. In the worst case (and
we have seen this at more than one company), the model is developed in a
special modeling environment using software that runs nowhere else. To deploy
the model, a programmer takes a printed description of the model and recodes
it in another programming language so it can be run on the scoring platform.
A more common problem is that the model uses input variables that are not
in the original data. This should not be a problem since the model inputs are at
least derived from the fields that were originally extracted to form the model
set. Unfortunately, data miners are not always good about keeping a clean,
reusable record of the transformations they applied to the data.
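One way to keep such a record is to put every derivation in a single function that is applied both when building the model set and when scoring. The following is a minimal sketch under that assumption; the function name `derive_inputs` and the field names (`start_date`, `total_spend`, `email`) are hypothetical and stand in for whatever the real extract contains.

```python
from datetime import date


def derive_inputs(record, as_of):
    """Turn one raw customer record into model inputs.

    The same function is applied to the model set and to the data being
    scored, so derived variables are always computed the same way.
    """
    tenure_days = (as_of - record["start_date"]).days
    # Treat very new customers as having one month of tenure so the
    # ratio below never divides by zero.
    months = max(tenure_days / 30.0, 1.0)
    return {
        "tenure_days": tenure_days,
        "spend_per_month": record["total_spend"] / months,
        "has_email": 1 if record.get("email") else 0,
    }


# Made-up record for illustration.
raw = {"start_date": date(2003, 1, 1), "total_spend": 600.0, "email": None}
inputs = derive_inputs(raw, as_of=date(2003, 12, 27))
print(inputs)
```

Because the transformation lives in code rather than in a one-off session, the scoring environment can call it directly instead of relying on someone's memory of what was done.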
The challenge in deploying data mining models is that they are often used
to score very large datasets. In some environments, every one of millions of
customer records is updated with a new behavior score every day. A score is
simply an additional field in a database table. Scores often represent a probability
or likelihood, so they are typically numeric values between 0 and 1, but by no
means necessarily so. A score might also be a class label provided by a
clustering model, for instance, or a class label with a probability.

Step Ten: Assess Results
The response chart in Figure 3.13 compares the number of responders reached
for a given amount of postage, with and without the use of a predictive model.
A more useful chart would show how many dollars are brought in for a given
expenditure on the marketing campaign. After all, if developing the model is
very expensive, a mass mailing may be more cost-effective than a targeted one.
What is the fixed cost of setting up the campaign and the model that supports it?

What is the cost per recipient of making the offer?

What is the cost per respondent of fulfilling the offer?

What is the value of a positive response?

Plugging these numbers into a spreadsheet makes it possible to measure the
impact of the model in dollars. The cumulative response chart can then be
turned into a cumulative profit chart, which determines where the sorted
mailing list should be cut off. If, for example, there is a high fixed price of
setting up the campaign and also a fairly high price per recipient of making the
offer (as when a wireless company buys loyalty by giving away mobile phones or
waiving renewal fees), the company loses money by going after too few
prospects because there are still not enough respondents to make up for the
high fixed costs of the program. On the other hand, if it makes the offer to too
many people, high variable costs begin to hurt.
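The spreadsheet calculation described above can be sketched in a few lines. All figures here are invented for illustration, and the function name `cumulative_profit` is hypothetical. The four cost and value parameters correspond to the four questions listed earlier, and the deciles are assumed to be sorted by model score, best first.

```python
def cumulative_profit(deciles, fixed_cost, cost_per_recipient,
                      cost_per_response, value_per_response):
    """Return cumulative profit after mailing to each successive decile.

    `deciles` is a list of (recipients, responders) pairs, sorted so the
    highest-scoring decile comes first.
    """
    profits = []
    recipients = responders = 0
    for n_mailed, n_responded in deciles:
        recipients += n_mailed
        responders += n_responded
        revenue = responders * value_per_response
        cost = (fixed_cost
                + recipients * cost_per_recipient
                + responders * cost_per_response)
        profits.append(revenue - cost)
    return profits


# Made-up campaign: 10 deciles of 10,000 prospects each, with the number
# of responders falling as the model score falls.
deciles = [(10000, r) for r in (300, 200, 150, 100, 80, 60, 40, 30, 20, 20)]
profits = cumulative_profit(deciles, fixed_cost=20000, cost_per_recipient=1,
                            cost_per_response=20, value_per_response=100)

# The best cutoff is the depth at which cumulative profit peaks.
best = max(range(len(profits)), key=profits.__getitem__)
print(f"Mail the top {best + 1} deciles for a profit of ${profits[best]:,}")
```

With these invented numbers, mailing only the top decile loses money because the fixed cost is not yet covered, profit peaks at the third decile, and mailing the whole list loses the most.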
Of course, the profit model is only as good as its inputs. While the fixed and
variable costs of the campaign are fairly easy to come by, the predicted value
of a responder can be harder to estimate. The process of figuring out what a
customer is worth is beyond the scope of this book, but a good estimate helps
to measure the true value of a data mining model.
In the end, the measure that counts the most is return on investment.
Measuring lift on a test set helps choose the right model. Profitability models
based on lift help decide how to apply the results of the model. But it is very
important to measure these things in the field as well. In a database marketing
application, this requires always setting aside control groups and carefully
tracking customer response according to various model scores.
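As one possible way of tracking response by model score, the sketch below compares the response rate of mailed customers with that of a control group within each score band. The record layout (`band`, `group`, `responded`), the helper names, and all counts are assumptions made for illustration, not a prescribed design.

```python
from collections import defaultdict


def response_rates(customers):
    """Response rate for each (score band, group) combination."""
    counts = defaultdict(lambda: [0, 0])  # key -> [responders, total]
    for c in customers:
        tally = counts[(c["band"], c["group"])]
        tally[0] += 1 if c["responded"] else 0
        tally[1] += 1
    return {key: resp / total for key, (resp, total) in counts.items()}


def make_group(band, group, size, responders):
    """Build made-up customer records for one score band and group."""
    return [{"band": band, "group": group, "responded": i < responders}
            for i in range(size)]


customers = (make_group("high", "mailed", 1000, 60)
             + make_group("high", "control", 1000, 20)
             + make_group("low", "mailed", 1000, 15)
             + make_group("low", "control", 1000, 10))

field_rates = response_rates(customers)
for band in ("high", "low"):
    mailed = field_rates[(band, "mailed")]
    control = field_rates[(band, "control")]
    print(f"{band}: mailed {mailed:.1%}, control {control:.1%}, "
          f"incremental {mailed - control:.1%}")
```

The difference between the mailed and control rates within a band estimates how much response the campaign itself produced there, which is the figure a return-on-investment calculation needs.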

Step Eleven: Begin Again
Every data mining project raises more questions than it answers. This is a good
thing. It means that new relationships are now visible that were not visible
before. The newly discovered relationships suggest new hypotheses to test
and the data mining process begins all over again.

Lessons Learned
Data mining comes in two forms. Directed data mining involves searching
through historical records to find patterns that explain a particular outcome.
Directed data mining includes the tasks of classification, estimation,
prediction, and profiling. Undirected data mining searches through the same
records for interesting patterns. It includes the tasks of clustering, finding
association rules, and description.
Data mining brings the business closer to data. As such, hypothesis testing
is a very important part of the process. However, the primary lesson of this
chapter is that data mining is full of traps for the unwary, and that following
a methodology based on experience can help avoid them.
The first hurdle is translating the business problem into one of the six tasks
that can be solved by data mining: classification, estimation, prediction,
affinity grouping, clustering, and profiling.
The next challenge is to locate appropriate data that can be transformed into
actionable information. Once the data has been located, it should be thoroughly
explored. The exploration process is likely to reveal problems with the data. It
will also help build up the data miner's intuitive understanding of the data.
The next step is to create a model set and partition it into training, validation,
and test sets.
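A minimal sketch of such a partition follows. The 60/20/20 proportions, the fixed seed, and the function name `partition` are assumptions chosen for illustration; the right proportions depend on how much data is available.

```python
import random


def partition(records, train=0.6, validation=0.2, seed=42):
    """Shuffle records and split them into training, validation, and test sets."""
    shuffled = list(records)
    # A seeded generator makes the split repeatable from run to run.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])  # whatever is left is the test set


# Stand-in for a model set of 1,000 records.
training, validation, test = partition(range(1000))
print(len(training), len(validation), len(test))
```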
Data transformations are necessary for two purposes: to fix problems with
the data such as missing values and categorical variables that take on too
many values, and to bring information to the surface by creating new variables
to represent trends and other ratios and combinations.
Once the data has been prepared, building models is a relatively easy
process. Each type of model has its own metrics by which it can be assessed,
but there are also assessment tools that are independent of the type of model.
Some of the most important of these are the lift chart, which shows how the
model has increased the concentration of the desired value of the target
variable, and the confusion matrix, which shows the misclassification error rate
for each of the target classes. The next chapter uses examples from real data
mining projects to show the methodology in action.
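The confusion matrix mentioned above can be sketched as a count of (actual, predicted) pairs, from which the misclassification rate for each target class follows. The labels and predictions below are invented, and the function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual class, predicted class) pair occurs."""
    return Counter(zip(actual, predicted))


def error_rate_by_class(actual, predicted):
    """Fraction of records of each actual class that were misclassified."""
    matrix = confusion_matrix(actual, predicted)
    totals = Counter(actual)
    return {cls: 1 - matrix[(cls, cls)] / totals[cls] for cls in totals}


# Made-up test set of ten records with a binary target.
actual = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "no"]
predicted = ["yes", "yes", "yes", "no", "no", "no", "no", "no", "yes", "yes"]

class_errors = error_rate_by_class(actual, predicted)
for cls, rate in class_errors.items():
    print(f"{cls}: {rate:.0%} misclassified")
```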

Data Mining Applications in
Marketing and Customer
Relationship Management

Some people find data mining techniques interesting from a technical
perspective. However, for most people, the techniques are interesting as a means
to an end. The techniques do not exist in a vacuum; they exist in a business
context. This chapter is about the business context.
This chapter is organized around a set of business objectives that can be
addressed by data mining. Each of the selected business objectives is linked to
specific data mining techniques appropriate for addressing the problem. The
business topics addressed in this chapter are presented in roughly ascending
order of complexity of the customer relationship. The chapter starts with the
problem of communicating with potential customers about whom little is
known, and works up to the varied data mining opportunities presented by
ongoing customer relationships that may involve multiple products, multiple
communications channels, and increasingly individualized interactions.
In the course of discussing the business applications, technical material is
introduced as appropriate, but the details of specific data mining techniques
are left for later chapters.

Prospecting seems an excellent place to begin a discussion of business
applications of data mining. After all, the primary definition of the verb to prospect

88 Chapter 4

comes from traditional mining, where it means to explore for mineral deposits or
oil. As a noun, a prospect is something with possibilities, evoking images of oil
fields to be pumped and mineral deposits to be mined. In marketing, a prospect
is someone who might reasonably be expected to become a customer if
approached in the right way. Both noun and verb resonate with the idea of
using data mining to achieve the business goal of locating people who will be
valuable customers in the future.
For most businesses, relatively few of Earth's more than six billion people
are actually prospects. Most can be excluded based on geography, age, ability
to pay, and need for the product or service. For example, a bank offering home
equity lines of credit would naturally restrict a mailing offering this type of
loan to homeowners who reside in jurisdictions where the bank is licensed to
operate. A company selling backyard swing sets would like to send its catalog
to households with children at addresses that seem likely to have backyards. A
magazine wants to target people who read the appropriate language and will
be of interest to its advertisers. And so on.
Data mining can play many roles in prospecting. The most important of
these are:
Identifying good prospects

Choosing a communication channel for reaching prospects

Picking appropriate messages for different groups of prospects

Although all of these are important, the first, identifying good prospects,
is the most widely implemented.

