

[Chart: Concentration (% of Responders) plotted against List Penetration (% of Prospects), 0% to 100%, with one line for the Response Model and one for No Model]

Figure 4.2 A cumulative gains or concentration chart shows the benefit of using a model.

The upper, curved line plots the concentration, the percentage of all
responders captured as more and more of the prospects are included in the
campaign. The straight diagonal line is there for comparison. It represents
what happens with no model, where the concentration simply equals the
penetration. Mailing to 30 percent of the prospects chosen at random would
find 30 percent of the responders. With the model, mailing to the top 30
percent of prospects finds 65 percent of the responders. The ratio of
concentration to penetration is the lift. The difference between these two
lines is the benefit. Lift was discussed in the previous chapter. Benefit is
discussed in a sidebar.
The model pictured here has lift of 2.17 at the third decile, meaning that
using the model, SAC will get a little more than twice as many responders for
its expenditure of $300,000 as it would have received by mailing to 30 percent
of its one million prospects at random.
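
The two measures are simple enough to check by hand. The following Python sketch uses the 30 percent penetration and 65 percent concentration quoted above; the function and variable names are illustrative only.

```python
def lift(concentration, penetration):
    """Ratio of concentration to penetration."""
    return concentration / penetration


def benefit(concentration, penetration):
    """Difference between concentration and penetration."""
    return concentration - penetration


# Third decile of the SAC model: 30% of prospects capture 65% of responders.
third_decile_lift = lift(0.65, 0.30)
third_decile_benefit = benefit(0.65, 0.30)

print(f"lift = {third_decile_lift:.2f}")
print(f"benefit = {third_decile_benefit:.0%} of responders")
```

Lift is a ratio and benefit is a difference, so they need not peak at the same depth in the list.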

Optimizing Campaign Profitability
There is no doubt that doubling the response rate to a campaign is a desirable
outcome, but how much is it actually worth? Is the campaign even profitable?
Although lift is a useful way of comparing models, it does not answer these
important questions. To address profitability, more information is needed. In
particular, calculating profitability requires information on revenues as well as
costs. Let's add a few more details to the SAC example.
The Simplifying Assumptions Corporation sells a single product for a
single price. The price of the product is $100. The total cost to SAC to
manufacture, warehouse, and distribute the product is $55. As already
mentioned, it costs one dollar to reach a prospect. There is now enough
information to calculate the value of a response. The gross value of each
response is $100. The net value of each response takes into account the costs
associated with the response ($55 for the cost of goods and $1 for the contact)
to achieve net revenue of $44 per response. This information is summarized in
Table 4.3.

Table 4.3 Profit/Loss Matrix for the Simplifying Assumptions Corporation

                  Responded
Contacted       Yes       No
Yes             $44       -$1
No              $0        $0
BENEFIT


Concentration charts, such as the one pictured in Figure 4.2, are usually
discussed in terms of lift. Lift measures the relationship of concentration to
penetration and is certainly a useful way of comparing the performance of two
models at a given depth in the prospect list. However, it fails to capture another
concept that seems intuitively important when looking at the chart: namely,
how far apart are the lines, and at what penetration are they farthest apart?
Our colleague, the statistician Will Potts, gives the name benefit to the
difference between concentration and penetration. Using his nomenclature, the
point where this difference is maximized is the point of maximum benefit. Note
that the point of maximum benefit does not correspond to the point of highest
lift. Lift is always maximized at the left edge of the concentration chart where
the response rate among the selected prospects is highest and the slope of the
curve is steepest.
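
The gap between the two points can be seen in the cumulative gains figures from Table 4.4. The sketch below is only an illustration at decile granularity, and it assumes that the 80 percent row has a cumulative gain of 98 percent (94 + 4), the value implied by that row's lift of 1.225.

```python
# Cumulative gains by decile, in percent: penetration -> concentration.
# Assumption: the 80% entry is 98 (94 + 4), consistent with a lift of 1.225.
CUMULATIVE_GAINS = {10: 30, 20: 50, 30: 65, 40: 78, 50: 85,
                    60: 90, 70: 94, 80: 98, 90: 100, 100: 100}

# Lift is the ratio of concentration to penetration; benefit is the difference.
lift_by_decile = {p: c / p for p, c in CUMULATIVE_GAINS.items()}
benefit_by_decile = {p: c - p for p, c in CUMULATIVE_GAINS.items()}

max_lift_at = max(lift_by_decile, key=lift_by_decile.get)
max_benefit_at = max(benefit_by_decile, key=benefit_by_decile.get)

print(f"highest lift at {max_lift_at}% penetration")
print(f"maximum benefit at {max_benefit_at}% penetration "
      f"({benefit_by_decile[max_benefit_at]} percentage points)")
```

With these figures the highest lift falls at the first decile, while the largest gap between the two lines falls at the fourth.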
The point of maximum benefit is a bit more interesting. To explain some of
its useful properties, this sidebar makes reference to some things (such as ROC
curves and KS tests) that are not explained in the main body of the book. Each
bulleted point is a formal statement about the maximum benefit point on the
concentration curve. The formal statements are followed by informal
explanations.
— The maximum benefit is proportional to the maximum distance between
the cumulative distribution functions of the probabilities in each class.
What this means is that the model score that cuts the prospect list at the
penetration where the benefit is greatest is also the score that maximizes
the Kolmogorov-Smirnov (KS) statistic. The KS test is popular among some
statisticians, especially in the financial services industry. It was developed
as a test of whether two distributions are different. Splitting the list at the
point of maximum benefit results in a “good list” and a “bad list” whose
distributions of responders are maximally separate from each other and
from the population. In this case, the “good list” has a maximum
proportion of responders and the “bad list” has a minimum proportion.
— The maximum benefit point on the concentration curve corresponds to
the maximum perpendicular distance between the corresponding ROC
curve and the no-model line.
The ROC curve resembles the more familiar concentration or cumulative
gains chart, so it is not surprising that there is a relationship between them. As
explained in another sidebar, the ROC curve shows the trade-off between two
types of misclassification error. The maximum benefit point on the cumulative
gains chart corresponds to a point on the ROC curve where the separation
between the classes is maximized.
— The maximum benefit point corresponds to the decision rule that
maximizes the unweighted average of sensitivity and specificity.
As used in the medical world, sensitivity is the proportion of people who
actually have the condition who get a positive result on a test. In other
words, it is the true positives divided by the sum of the true positives
and false negatives. Sensitivity measures the likelihood that the test
detects a case that is really present. Specificity is the proportion of
people who do not have the condition who get a negative result on the
test, that is, the true negatives divided by the sum of the true negatives
and false positives. A good test should be both sensitive and specific.
The maximum benefit point is the cutoff that maximizes the average of
these two measures. In Chapter 8, related concepts go by the names recall
and precision, the terminology used in information retrieval. Recall,
which is the same measure as sensitivity, is the proportion of all
articles on the correct topic that are returned by a Web search or other
text query. Precision, which is a different measure from specificity, is
the percentage of the returned articles that are on the correct topic.
— The maximum benefit point corresponds to a decision rule that
minimizes the expected loss assuming the misclassification costs are
inversely proportional to the prevalence of the target classes.
One way of evaluating classification rules is to assign a cost to each type
of misclassification and compare rules based on that cost. Whether they
represent responders, defaulters, fraudsters, or people with a particular
disease, the rare cases are generally the most interesting, so missing one of
them is more costly than misclassifying one of the common cases. Under
that assumption, the maximum benefit point picks a good classification rule.
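
The link between benefit and the average of sensitivity and specificity can be checked numerically. The sketch below is an illustration under stated assumptions, not a proof: it uses the standard definitions (sensitivity as true positives over all actual responders, specificity as true negatives over all actual nonresponders), the SAC figures of one million prospects with a 1 percent prevalence of responders, and the cumulative gains from Table 4.4 with the 80 percent row taken as 98 percent. The helper names are hypothetical.

```python
PROSPECTS = 1_000_000
RESPONDERS = 10_000   # assumption: 1 percent prevalence, as in the SAC example
PREVALENCE = RESPONDERS / PROSPECTS

# Cumulative gains in percent; the 80% entry is assumed to be 98 (94 + 4).
CUMULATIVE_GAINS = {10: 30, 20: 50, 30: 65, 40: 78, 50: 85,
                    60: 90, 70: 94, 80: 98, 90: 100, 100: 100}


def sensitivity_specificity(penetration_pct):
    """Treat 'contacted' as a positive prediction and count the four outcomes."""
    contacted = PROSPECTS * penetration_pct // 100
    true_pos = RESPONDERS * CUMULATIVE_GAINS[penetration_pct] // 100
    false_pos = contacted - true_pos
    false_neg = RESPONDERS - true_pos
    true_neg = PROSPECTS - contacted - false_neg
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


def benefit(penetration_pct):
    """Concentration minus penetration, as a fraction."""
    return (CUMULATIVE_GAINS[penetration_pct] - penetration_pct) / 100


def average_sens_spec(penetration_pct):
    return sum(sensitivity_specificity(penetration_pct)) / 2


best_by_benefit = max(CUMULATIVE_GAINS, key=benefit)
best_by_average = max(CUMULATIVE_GAINS, key=average_sens_spec)
print(best_by_benefit, best_by_average)

# Benefit equals (1 - prevalence) * (sensitivity + specificity - 1) at every cutoff.
for p in CUMULATIVE_GAINS:
    sens, spec = sensitivity_specificity(p)
    assert abs(benefit(p) - (1 - PREVALENCE) * (sens + spec - 1)) < 1e-9
```

Because benefit is a constant multiple of sensitivity plus specificity minus one, the same cutoff maximizes both quantities.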

Table 4.3 says that if a prospect is contacted and responds, the company
makes $44. If a prospect is contacted but fails to respond, the
company loses $1. In this simplified example, there is neither cost nor benefit
in choosing not to contact a prospect. A more sophisticated analysis might take
into account the fact that there is an opportunity cost to not contacting a
prospect who would have responded, that even a nonresponder may become
a better prospect as a result of the contact through increased brand awareness,
and that responders may have a higher lifetime value than indicated by the
single purchase. Apart from those complications, this simple profit and loss
matrix can be used to translate the response to a campaign into a profit figure.
Ignoring fixed campaign overhead costs, if one prospect responds for every 44
who fail to respond (a response rate of about 2.2 percent), the campaign
breaks even. If the response rate is better than that, the campaign is
profitable.
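
The break-even point follows directly from the two nonzero cells of the profit and loss matrix. A minimal Python sketch, ignoring fixed costs as the text does; the function names are illustrative.

```python
NET_PER_RESPONSE = 44   # contacted and responded (Table 4.3)
COST_PER_CONTACT = 1    # contacted, no response (Table 4.3)


def campaign_profit(contacted, responders):
    """Profit implied by the profit/loss matrix, with no fixed overhead."""
    nonresponders = contacted - responders
    return responders * NET_PER_RESPONSE - nonresponders * COST_PER_CONTACT


def break_even_response_rate():
    """Response rate at which gains from responders offset wasted contacts."""
    return COST_PER_CONTACT / (NET_PER_RESPONSE + COST_PER_CONTACT)


# One responder for every 44 nonresponders: 45 contacts in all.
print(campaign_profit(contacted=45, responders=1))
print(f"break-even response rate: {break_even_response_rate():.2%}")
```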

WARNING If the cost of a failed contact is set too low, the profit and loss
matrix suggests contacting everyone. This may not be a good idea for other
reasons. It could lead to prospects being bombarded with inappropriate offers.


How the Model Affects Profitability
How does the model whose lift and benefit are characterized by Figure 4.2
affect the profitability of a campaign? The answer depends on the start-up cost
for the campaign, the underlying prevalence of responders in the population,
and the cutoff penetration of people contacted. Recall that SAC had a budget
of $300,000. Assume that the underlying prevalence of responders in the
population is 1 percent. The budget is enough to contact 300,000 prospects, or
30 percent of the prospect pool. At a depth of 30 percent, the model provides lift
of about 2, so SAC can expect twice as many responders as it would have
without the model. In this case, twice as many means 2 percent instead of
1 percent, yielding 6,000 (2% * 300,000) responders, each of whom is worth $44
in net revenue. Under these assumptions, SAC grosses $600,000 and nets $264,000
from responders. Meanwhile, 98 percent of the contacted prospects, or 294,000,
do not respond. Each of these costs a dollar, so SAC loses $30,000 on the
campaign ($264,000 less $294,000).
Table 4.4 shows the data used to generate the concentration chart in Figure
4.2. It suggests that the campaign could be made profitable by spending less
money to contact fewer prospects while getting a better response rate. Mailing
to only 100,000 prospects, or the top 10 percent of the prospect list, achieves a
lift of 3. This turns the underlying response rate of 1 percent into a response
rate of 3 percent. In this scenario, 3,000 people respond, yielding net revenue
of $132,000. There are now 97,000 people who fail to respond and each of them
costs one dollar. The resulting profit is $35,000. Better still, SAC has $200,000
left in the marketing budget to use on another campaign or to improve the
offer made in this one, perhaps increasing response still more.
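
The two scenarios can be reproduced with a few lines of Python. This is a sketch under the assumptions stated in the text: a 1 percent underlying response rate, lift rounded to 2 at the third decile and 3 at the first, and the top decile of one million prospects taken as 100,000 contacts. The `scenario` helper is hypothetical.

```python
NET_PER_RESPONSE = 44       # net revenue per responder (Table 4.3)
COST_PER_CONTACT = 1        # loss per contacted nonresponder
BASE_RESPONSE_RATE = 0.01   # assumed underlying response rate
BUDGET = 300_000


def scenario(contacted, lift):
    """Return (responders, profit) for a mailing of the given size and lift."""
    responders = round(contacted * BASE_RESPONSE_RATE * lift)
    nonresponders = contacted - responders
    profit = responders * NET_PER_RESPONSE - nonresponders * COST_PER_CONTACT
    return responders, profit


top_30 = scenario(contacted=300_000, lift=2)   # lift rounded to 2, as in the text
top_10 = scenario(contacted=100_000, lift=3)
budget_left = BUDGET - 100_000 * COST_PER_CONTACT

print("top 30 percent:", top_30)
print("top 10 percent:", top_10)
print("budget left after the smaller mailing:", budget_left)
```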

Table 4.4 Lift and Cumulative Gains by Decile


Penetration   Gains   Cumulative Gains   Lift
0%            0%      0%                 0
10%           30%     30%                3.000
20%           20%     50%                2.500
30%           15%     65%                2.167
40%           13%     78%                1.950
50%           7%      85%                1.700
60%           5%      90%                1.500
70%           4%      94%                1.343
80%           4%      98%                1.225
90%           2%      100%               1.111
100%          0%      100%               1.000

A smaller, better-targeted campaign can be more profitable than a larger and
more expensive one. Lift increases as the list gets smaller, so is smaller always
better? The answer is no, because the absolute revenue decreases as the
number of responders decreases. As an extreme example, assume the model can
generate lift of 100 by finding a group with 100 percent response rate when the
underlying response rate is 1 percent. That sounds fantastic, but if there are
only 10 people in the group, they are still only worth $440. Also, a more
realistic example would include some up-front fixed costs. Figure 4.3 shows what
happens with the assumption that there is a $20,000 fixed cost for the
campaign in addition to the cost of $1 per contact, revenue of $44 per response,
and an underlying response rate of 1 percent. The campaign is only profitable
for a small range of file penetrations around 10 percent.
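
A profit-by-decile calculation of this kind can be sketched in Python as follows. The inputs are the ones named in the text ($20,000 fixed cost, $1 per contact, $44 per response, 1 percent response rate, one million prospects) together with the cumulative gains from Table 4.4. The 80 percent row is assumed to be 98 percent, and the function name is illustrative, so the output is not a reproduction of Figure 4.3 itself.

```python
PROSPECTS = 1_000_000
TOTAL_RESPONDERS = 10_000   # 1 percent of the prospect pool
NET_PER_RESPONSE = 44
COST_PER_CONTACT = 1
FIXED_COST = 20_000

# Cumulative gains in percent; the 80% entry is assumed to be 98 (94 + 4).
CUMULATIVE_GAINS = {0: 0, 10: 30, 20: 50, 30: 65, 40: 78, 50: 85,
                    60: 90, 70: 94, 80: 98, 90: 100, 100: 100}


def profit_at(penetration_pct):
    """Campaign profit when mailing to the top penetration_pct of the list."""
    contacted = PROSPECTS * penetration_pct // 100
    responders = TOTAL_RESPONDERS * CUMULATIVE_GAINS[penetration_pct] // 100
    nonresponders = contacted - responders
    return (responders * NET_PER_RESPONSE
            - nonresponders * COST_PER_CONTACT
            - FIXED_COST)


for pct in CUMULATIVE_GAINS:
    print(f"{pct:>3}%  {profit_at(pct):>10,}")

best = max(CUMULATIVE_GAINS, key=profit_at)
print(f"most profitable cutoff among the deciles: {best}%")
```

Under these assumptions the profit is positive only at the first two deciles and highest at the first, which is in line with the narrow profitable range described above.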
Using the model to optimize the profitability of a campaign seems more
attractive than simply using it to pick whom to include on a mailing or call list
of predetermined size, but the approach is not without pitfalls. For one thing,
the results are dependent on the campaign cost, the response rate, and the
revenue per responder, none of which are known prior to running the campaign.
In the example, these were known, but in real life, they can only be estimated.
It would only take a small variation in any one of these to make the campaign
in the example above completely unprofitable or to make it profitable over a
much larger range of deciles.
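
To see how sensitive the result is, the same profit calculation can be rerun with different underlying response rates. In the sketch below, the alternative rates of 0.8 and 1.2 percent are hypothetical values chosen only for illustration. The cumulative gains come from Table 4.4, with the 80 percent row assumed to be 98 percent, and the model's lift is assumed not to change with the response rate.

```python
PROSPECTS = 1_000_000
NET_PER_RESPONSE = 44
COST_PER_CONTACT = 1
FIXED_COST = 20_000

# Cumulative gains in percent; the 80% entry is assumed to be 98 (94 + 4).
CUMULATIVE_GAINS = {10: 30, 20: 50, 30: 65, 40: 78, 50: 85,
                    60: 90, 70: 94, 80: 98, 90: 100, 100: 100}


def profit_at(penetration_pct, rate_basis_points):
    """Profit at a cutoff; the response rate is in basis points (100 = 1%)."""
    total_responders = PROSPECTS * rate_basis_points // 10_000
    contacted = PROSPECTS * penetration_pct // 100
    responders = total_responders * CUMULATIVE_GAINS[penetration_pct] // 100
    return (responders * NET_PER_RESPONSE
            - (contacted - responders) * COST_PER_CONTACT
            - FIXED_COST)


def profitable_deciles(rate_basis_points):
    """Deciles at which the campaign makes money for a given response rate."""
    return [p for p in CUMULATIVE_GAINS if profit_at(p, rate_basis_points) > 0]


# 80 and 120 basis points are hypothetical alternatives to the 1% in the text.
for rate in (80, 100, 120):
    print(f"response rate {rate / 100:.1f}%: profitable at {profitable_deciles(rate)}")
```

With these inputs, a response rate of 0.8 percent leaves no profitable decile, while 1.2 percent makes the campaign profitable through the fourth decile.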

Profit by Decile


