

Figure 4.3 Campaign profitability as a function of penetration.
Data Mining Applications 105


Figure 4.4 A 20 percent variation in response rate, cost, and revenue per responder has a
large effect on the profitability of a campaign. (Plotted scenarios: 20% down and 20% up;
horizontal axis from 0% to 100%.)

Figure 4.4 shows what would happen to this campaign if the assumptions
on cost, response rate, and revenue were all off by 20 percent. Under the
pessimistic scenario, the best that can be achieved is a loss of $20,000. Under
the optimistic scenario, the campaign achieves maximum profitability of
$161,696 at 40 percent penetration. Estimates of cost tend to be fairly accurate
since they are based on postage rates, printing charges, and other factors that
can be determined in advance. Estimates of response rates and revenues are
usually little more than guesses. So, while optimizing a campaign for
profitability sounds appealing, it is unlikely to be possible in practice without
conducting an actual test campaign. Modeling campaign profitability in advance
is primarily a what-if analysis to determine likely profitability bounds based
on various assumptions. Although optimizing a campaign in advance is not
particularly useful, it can be useful to measure the results of a campaign after
it has been run. However, to do this effectively, there need to be customers
included in the campaign with a full range of response scores, even customers
from lower deciles.
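The kind of what-if analysis described above can be sketched in a few lines of
Python. Every number in this sketch is invented for illustration (the
population, the costs, the response rate, the revenue per responder, and the
cumulative gains curve); none of them are the figures behind Figure 4.4. The
names Assumptions, CUM_GAINS, profit, vary, and best are likewise hypothetical.

```python
from dataclasses import dataclass, replace

# Hypothetical cumulative share of all responders reached at 0%, 10%, ..., 100%
# penetration; a real campaign would read this off the model's gains chart.
CUM_GAINS = [0.0, 0.30, 0.50, 0.65, 0.75, 0.82, 0.88, 0.93, 0.96, 0.98, 1.0]


@dataclass(frozen=True)
class Assumptions:
    # All defaults are invented for illustration only.
    population: int = 1_000_000
    fixed_cost: float = 20_000.0
    cost_per_piece: float = 1.0
    response_rate: float = 0.01
    revenue_per_responder: float = 45.0


def profit(a, step):
    """Profit when mailing the top step*10 percent of the scored population."""
    mailed = a.population * step / 10
    responders = a.population * a.response_rate * CUM_GAINS[step]
    revenue = responders * a.revenue_per_responder
    return revenue - a.fixed_cost - mailed * a.cost_per_piece


def vary(a, shift):
    """shift=+0.2 is the optimistic case (costs down 20%, response and revenue
    up 20%); shift=-0.2 is the pessimistic case."""
    return replace(
        a,
        fixed_cost=a.fixed_cost * (1 - shift),
        cost_per_piece=a.cost_per_piece * (1 - shift),
        response_rate=a.response_rate * (1 + shift),
        revenue_per_responder=a.revenue_per_responder * (1 + shift),
    )


def best(a):
    """Return (penetration percent, profit) at the most profitable step."""
    step = max(range(len(CUM_GAINS)), key=lambda s: profit(a, s))
    return step * 10, profit(a, step)


base = Assumptions()
scenarios = [("20% down", vary(base, -0.2)), ("base", base), ("20% up", vary(base, 0.2))]
for name, a in scenarios:
    pct, value = best(a)
    print(f"{name}: best penetration {pct}%, profit ${value:,.0f}")
```

Even with made-up inputs, the sketch shows the pattern in the text: moving all
three assumptions by 20 percent shifts both the best penetration level and the
sign of the best achievable profit.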

WARNING The profitability of a campaign depends on so many factors that
can only be estimated in advance that the only reliable way to predict it is to
run an actual market test.
106 Chapter 4

Reaching the People Most Influenced by the Message
One of the more subtle simplifying assumptions made so far is that a
model with good lift is identifying people who respond to the offer. Since these
people receive an offer and proceed to make purchases at a higher rate than
other people, the assumption seems to be confirmed. There is another
possibility, however: The model could simply be identifying people who are
likely to buy the product with or without the offer.
This is not a purely theoretical concern. A large bank, for instance, did a
direct mail campaign to encourage customers to open investment accounts.
Their analytic group developed a model for response for the mailing. They
went ahead and tested the campaign, using three groups:
Control group: A group chosen at random to receive the mailing.

Test group: A group chosen by modeled response scores to receive the
mailing.

Holdout group: A group chosen by model scores who did not receive the
mailing.
The models did quite well. That is, the customers who had high model
scores did indeed respond at a higher rate than the control group and
customers with lower scores. However, customers in the holdout group also
responded at the same rate as customers in the test group.
What was happening? The model worked correctly to identify people
interested in such accounts. However, every part of the bank was focused on
getting customers to open investment accounts: broadcast advertising, posters
in branches, messages on the Web, training for customer service staff. The
direct mail was drowned in the noise from all the other channels, and turned
out to be unnecessary.

TIP To test whether both a model and the campaign it supports are effective,
track the relationship of response rate to model score among prospects in a
holdout group who are not part of the campaign as well as among prospects
who are included in the campaign.
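A minimal sketch of this check follows, assuming each prospect is recorded as
a (score band, in campaign, responded) triple. The record layout, the function
names, and the sample counts are all hypothetical; the sample is built so that
the holdout group responds at the same rate as the mailed group, as in the
bank example.

```python
from collections import defaultdict


def response_rates(records):
    """Response rate for each (score_band, in_campaign) cell.

    records: iterable of (score_band, in_campaign, responded) triples.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [responders, total]
    for band, in_campaign, responded in records:
        cell = counts[(band, in_campaign)]
        cell[0] += int(responded)
        cell[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


def incremental_by_band(records):
    """Campaign response rate minus holdout response rate, per score band."""
    rates = response_rates(records)
    bands = sorted({band for band, _ in rates})
    return {
        band: rates[(band, True)] - rates[(band, False)]
        for band in bands
        if (band, True) in rates and (band, False) in rates
    }


def make(band, in_campaign, responders, total):
    """Build hypothetical records: `responders` of `total` prospects respond."""
    return [(band, in_campaign, i < responders) for i in range(total)]


# Invented counts: high scorers respond more often, mailed or not.
records = (
    make("high", True, 4, 10) + make("high", False, 4, 10)
    + make("low", True, 1, 10) + make("low", False, 1, 10)
)
rates = response_rates(records)
incremental = incremental_by_band(records)
```

Here the model separates high and low bands correctly, yet the incremental
response in every band is zero, which is the signature of a model that works
supporting a mailing that does not.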

The goal of a marketing campaign is to change behavior. In this regard,
reaching a prospect who is going to purchase anyway is little more effective
than reaching a prospect who will not purchase despite having received the
offer. A group identified as likely responders may also be less likely to be
influenced by a marketing message. Their membership in the target group means
that they are likely to have been exposed to many similar messages in the past
from competitors. They are likely to already have the product or a close
substitute or to be firmly entrenched in their refusal to purchase it. A
marketing message may make more of a difference with people who have not
heard it all
before. Segments with the highest scores might have responded anyway, even
without the marketing investment. This leads to the almost paradoxical
conclusion that the segments with the highest scores in a response model may
not provide the biggest return on a marketing investment.

Differential Response Analysis
The way out of this dilemma is to directly model the actual goal of the
campaign, which is not simply reaching prospects who then make purchases. The
goal should be reaching prospects who are more likely to make purchases
because of having been contacted. This is known as differential response analysis.
Differential response analysis starts with a treated group and a control
group. If the treatment has the desired effect, overall response will be higher in
the treated group than in the control group. The object of differential response
analysis is to find segments where the difference in response between the
treated and untreated groups is greatest. Quadstone's marketing analysis
software has a module that performs this differential response analysis (which
they call “uplift analysis”) using a slightly modified decision tree as illustrated
in Figure 4.5.
The tree in the illustration is based on the response data from a test mailing,
shown in Table 4.5. The data tabulates the take-up rate by age and sex for an
advertised service for a treated group that received a mailing and a control
group that did not.
It doesn't take much data mining to see that the group with the highest
response rate is young men who received the mailing, followed by old men
who received the mailing. Does that mean that a campaign for this service
should be aimed primarily at men? Not if the goal is to maximize the number
of new customers who would not have signed up without prompting. Men
included in the campaign do sign up for the service in greater numbers than
women, but men are more likely to purchase the service in any case. The
differential response tree makes it clear that the group most affected by the
campaign is old women. This group is not at all likely (0.4 percent) to purchase
the service without prompting, but with prompting they experience a more than
tenfold increase in purchasing.

Table 4.5 Response Data from a Test Mailing

         CONTROL GROUP      TREATED GROUP
         YOUNG    OLD       YOUNG          OLD
women    0.8%     0.4%      4.1% (+3.3)    4.6% (+4.2)
men      2.8%     3.3%      6.2% (+3.4)    5.2% (+1.9)
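The differences shown in parentheses in Table 4.5 can be recomputed directly
from the take-up rates. The short sketch below does so and contrasts the
segment with the highest response against the segment with the highest
differential response. The dictionary layout and function name are
illustrative choices, and the assignment of the first two columns to the
control group and the last two to the treated group is read from the
surrounding text.

```python
# Take-up rates in percent from Table 4.5, keyed by (sex, age).
# Column assignment (control first, treated second) is inferred from the text.
control = {
    ("women", "young"): 0.8, ("women", "old"): 0.4,
    ("men", "young"): 2.8, ("men", "old"): 3.3,
}
treated = {
    ("women", "young"): 4.1, ("women", "old"): 4.6,
    ("men", "young"): 6.2, ("men", "old"): 5.2,
}


def uplift_by_segment(treated, control):
    """Treated rate minus control rate, in percentage points, per segment."""
    return {seg: round(treated[seg] - control[seg], 1) for seg in treated}


uplift = uplift_by_segment(treated, control)

# Segment that responds most when mailed versus segment the mailing moves most.
best_response = max(treated, key=treated.get)
best_uplift = max(uplift, key=uplift.get)

for seg in sorted(uplift, key=uplift.get, reverse=True):
    print(seg, f"treated {treated[seg]}%", f"uplift {uplift[seg]:+.1f} points")
```

Ranking by treated response puts young men first, while ranking by uplift puts
old women first, which is the distinction the section draws.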



Objective: Respond
Each node shows the difference in response between the groups (uplift) and the
two group sizes.

Root: uplift = +3.2% of 49,873 & 50,127
    Female: +3.8% of 25,100 & 25,215
        Young (#3): +3.3% of 12,353 & 12,379
        Old (#4): +4.2% of 12,747 & 12,836
    Male: +2.6% of 24,773 & 24,912
        Young (#5): +3.4% of 12,321 & 12,158
        Old (#6): +1.9% of 12,452 & 12,754

Figure 4.5 Quadstone's differential response tree tries to maximize the difference in
response between the treated group and a control group.
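To illustrate how a tree might choose its first split when the target is
uplift rather than response, the sketch below scores each candidate attribute
by how far apart the uplift of its two branches is. This criterion is an
assumption made for illustration, not a description of Quadstone's algorithm,
and it simplifies by averaging segment uplifts without weighting by segment
size (the four segments are close to equal in size). The uplift values are the
differences reported for the four leaves.

```python
# Uplift in percentage points for each (sex, age) segment.
uplift = {
    ("women", "young"): 3.3, ("women", "old"): 4.2,
    ("men", "young"): 3.4, ("men", "old"): 1.9,
}
ATTRIBUTES = {"sex": 0, "age": 1}  # position of each attribute in the key


def branch_uplifts(attribute):
    """Unweighted mean uplift for each value of the attribute (a simplification)."""
    index = ATTRIBUTES[attribute]
    groups = {}
    for segment, value in uplift.items():
        groups.setdefault(segment[index], []).append(value)
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


def split_gap(attribute):
    """Illustrative split score: gap between the best and worst branch uplift."""
    branches = branch_uplifts(attribute)
    return max(branches.values()) - min(branches.values())


best_split = max(ATTRIBUTES, key=split_gap)
for attribute in ATTRIBUTES:
    print(attribute, branch_uplifts(attribute), f"gap {split_gap(attribute):.2f}")
```

Under this simple score, sex separates the branches by roughly one percentage
point of uplift and age by much less, so sex would be chosen first, matching
the top-level split in Figure 4.5.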

Using Current Customers to Learn About Prospects
A good way to find good prospects is to look in the same places that today's best
customers came from. That means having some way of determining who the
best customers are today. It also means keeping a record of how current
customers were acquired and what they looked like at the time of acquisition.
Of course, the danger of relying on current customers to learn where to look
for prospects is that the current customers reflect past marketing decisions.
Studying current customers will not suggest looking for new prospects
anyplace that hasn't already been tried. Nevertheless, the performance of current
customers is a great way to evaluate the existing acquisition channels. For
prospecting purposes, it is important to know what current customers looked
like back when they were prospects themselves. Ideally you should:
Start tracking customers before they become customers.

Gather information from new customers at the time they are acquired.

Model the relationship between acquisition-time data and future
outcomes of interest.
The following sections provide some elaboration.
Start Tracking Customers before They Become Customers
It is a good idea to start recording information about prospects even before
they become customers. Web sites can accomplish this by issuing a cookie each
time a visitor is seen for the first time and starting an anonymous profile that
remembers what the visitor did. When the visitor returns (using the same
browser on the same computer), the cookie is recognized and the profile is

