

Figure 18.2 Study design for the analytic customer relationship marketing test.
(Diagram labels: Entire Subscriber Base, Control Group.)

In the end, the outbound telemarketing company simply called people from
the test and control groups and asked them a series of questions designed to
elicit their level of satisfaction and volunteered to refer any problems reported
to customer service. Despite this rather lame intervention, 60-day retention
was significantly better for the test group than for the control group.
Apparently, just showing that the company cared enough to call was enough to
decrease churn.
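
One common way to check whether a retention difference like this is statistically significant is a two-proportion z-test. The sketch below is only an illustration: the counts are hypothetical (the study's actual figures are not given here), and the function name is ours.

```python
import math

def two_proportion_z(retained_test, n_test, retained_ctrl, n_ctrl):
    """Return (z, one-sided p-value) for H1: test retention > control retention."""
    p_test = retained_test / n_test
    p_ctrl = retained_ctrl / n_ctrl
    # Pooled retention rate under the null hypothesis of no difference.
    pooled = (retained_test + retained_ctrl) / (n_test + n_ctrl)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_ctrl))
    z = (p_test - p_ctrl) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical 60-day retention counts, NOT figures from the study.
z, p = two_proportion_z(retained_test=960, n_test=1000,
                        retained_ctrl=930, n_ctrl=1000)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

With these made-up counts the difference would count as significant at conventional levels; with smaller groups the same three-point gap might not.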

The Data
In the course of several interviews with the client, we identified two sources of
data for use in the pilot. The first source was a customer profile database that
had already been set up by a database marketing company. This database
contained summary information for each subscriber including the billing plan,
type of phone, local minutes of use by month, roaming minutes of use by
month, number of calls to and from each identified cellular market in the
United States, and dozens of other similar fields.
The second source was call detail data collected from the wireless switches.
Each time a mobile phone is switched on, it begins a two-way conversation
with nearby cell sites. The cell sites relay data from the telephone such as the
612 Chapter 18

serial number and phone type to a central switching office. Computers at the
switching office figure out which cell site the phone should be talking to at the
moment and send a message back to the phone telling it which cell it is using
and what frequency to tune to.
When the subscriber enters a phone number and presses the send button,
the number is relayed to the central switch, which in turn sets up the call over
regular land lines or relays it to the cell closest to another wireless subscriber.
Every switch generates a call detail record that includes the subscriber ID, the
originating number, the number called, the originating cell, the call duration,
the call termination reason, and so on. These call detail records were used to
generate a behavioral profile of each customer, including such things as the
number of distinct numbers called and the proportion of calls by time of day
and day of week.
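
As a rough sketch of how such a behavioral profile might be derived, the code below summarizes a handful of made-up call detail records per subscriber. The record layout, the sample values, and the definition of "peak" (weekday calls starting between 08:00 and 17:59) are assumptions for illustration, not the format used by the switches.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, simplified call detail records:
# (subscriber_id, number_called, start_time, duration_seconds)
records = [
    ("S1", "555-0101", datetime(1998, 3, 2, 9, 15), 120),   # Monday morning
    ("S1", "555-0102", datetime(1998, 3, 2, 19, 40), 45),   # Monday evening
    ("S1", "555-0101", datetime(1998, 3, 7, 11, 5), 300),   # Saturday
    ("S2", "555-0199", datetime(1998, 3, 3, 8, 30), 60),    # Tuesday morning
]

def build_profiles(records):
    """Summarize each subscriber's calls into a small behavioral profile."""
    distinct = defaultdict(set)
    calls = defaultdict(int)
    peak = defaultdict(int)      # assumed: weekday calls starting 08:00-17:59
    weekend = defaultdict(int)
    for sub, called, start, _duration in records:
        calls[sub] += 1
        distinct[sub].add(called)
        if start.weekday() >= 5:          # Saturday = 5, Sunday = 6
            weekend[sub] += 1
        elif 8 <= start.hour < 18:
            peak[sub] += 1
    return {
        sub: {
            "distinct_numbers": len(distinct[sub]),
            "pct_peak": peak[sub] / calls[sub],
            "pct_weekend": weekend[sub] / calls[sub],
        }
        for sub in calls
    }

profiles = build_profiles(records)
print(profiles["S1"]["distinct_numbers"])  # 2
```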

The pilot project used 6 months of data for around 50,000 subscribers, some
of whom canceled their accounts and some of whom did not. Our original
intention was to merge the two data sources so that a given subscriber's data
from the marketing database (billing plan, tenure, type of phone, total minutes
of use, home town, and so on) would be linked to the detail records for each of
his or her calls. That way, a single model could be built based on independent
variables from both sources. For technical reasons, this proved difficult, so due
to time and budgetary constraints we ended up building two separate models,
one based on the marketing data and one based on call detail data.
The marketing data was already summarized at the customer level and
stored in an easily accessible database system. Getting the call detail data into a
usable form was more challenging. Each switch had its own collection of
reel-to-reel tapes like the ones used to represent computers in 1960s movies. These
tapes were continuously recycled so that a 90-day moving window was always
current, with the tapes from 90 days earlier being used to record the current
day's calls. Since eight tapes were written every day, we found ourselves
looking at over 700 tape reels, each of which had to be loaded individually by hand
into a borrowed 9-track tape drive. Once loaded, the call detail data, which was
written in an arcane format unique to the switching equipment, needed
extensive preprocessing in order to be made ready for analysis. The 70 million call
detail records were reduced to 10 million by filtering out records that did not
relate to calls to or from the churn model population of around 50,000
subscribers.
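
The tape count and the filtering step can be sketched as follows. The tapes-per-day and window figures come from the text; the record layout (dicts with "origin" and "called" keys) and the sample numbers are hypothetical, since the real switch format is not described.

```python
# 8 tapes per day over a 90-day recycling window (both figures from the text).
TAPES_PER_DAY = 8
WINDOW_DAYS = 90
total_tapes = TAPES_PER_DAY * WINDOW_DAYS
print(total_tapes)  # 720, consistent with "over 700 tape reels"

def filter_to_population(records, population):
    """Yield only records with the originating or called number in `population`.

    `records` is any iterable of dicts with hypothetical keys "origin" and
    "called"; a generator keeps memory flat when streaming millions of rows.
    """
    for rec in records:
        if rec["origin"] in population or rec["called"] in population:
            yield rec

# Tiny made-up sample, not real call detail.
sample = [
    {"origin": "555-0001", "called": "555-9000"},
    {"origin": "555-7000", "called": "555-0002"},
    {"origin": "555-7000", "called": "555-9000"},
]
population = {"555-0001", "555-0002"}
kept = list(filter_to_population(sample, population))
print(len(kept))  # 2
```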
Even before predictive modeling began, simple profiling of the call detail
data suggested many possible avenues for increasing profitability. Once call
detail was available in a queryable form, it became possible to answer
questions such as:
Are subscribers who make many short calls more or less loyal than those who make fewer, longer calls?
Do dropped calls lead to calls to customer service?


Putting Data Mining to Work 613

What is the size of a subscriber's “calling circle” for both mobile-to-mobile and mobile-to-fixed-line calling?
How does a subscriber's usage vary from hour to hour, month to month, and weekday to weekend?
Does the subscriber call any radio station call-in lines?
How often does a subscriber call voice mail?
How often does a subscriber call customer service?
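
To illustrate how one of these questions might be answered once call detail is queryable, here is a minimal sketch of a calling-circle calculation. It assumes a calling circle means the distinct numbers a subscriber called or was called by, and that a lookup of known mobile numbers is available; both the definition and the sample data are our assumptions.

```python
def calling_circle(subscriber, calls, mobile_numbers):
    """Count distinct mobile and fixed-line counterparts for one subscriber.

    `calls` is an iterable of (origin, destination) pairs; `mobile_numbers`
    is a hypothetical lookup set of numbers known to be wireless.
    """
    mobile, fixed = set(), set()
    for origin, dest in calls:
        if origin == subscriber:
            other = dest
        elif dest == subscriber:
            other = origin
        else:
            continue  # call does not involve this subscriber
        (mobile if other in mobile_numbers else fixed).add(other)
    return {"mobile": len(mobile), "fixed": len(fixed)}

# Made-up calls between parties A-D; A, B, and D are assumed to be mobile.
calls = [("A", "B"), ("A", "C"), ("B", "A"), ("D", "A"), ("C", "D")]
print(calling_circle("A", calls, mobile_numbers={"A", "B", "D"}))
# {'mobile': 2, 'fixed': 1}
```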

The answers to these and many other questions suggested a number of
marketing initiatives to stimulate cellular phone use at particular times and in
particular ways. Furthermore, as we had hoped, variables constructed from the
call detail, such as size of calling circle, proved to be highly predictive of
churn.

The Findings
Data mining isolated several customer segments at high risk for churn. Some
of these were more actionable than others. For example, it turned out that
subscribers who, judging by where their calls entered the network, commuted to
New York were much more likely to churn than subscribers who commuted to
Philadelphia. This was a coverage issue. Customers who lived in the Comcast
coverage area and commuted to New York found themselves roaming (making
use of another company's network) for most of every work day. The billing
plans in effect at that time made roaming very expensive. Commuters to
Philadelphia remained within the Comcast coverage area for their entire
commute and work day and so incurred no roaming charges. This problem was
not very actionable because neither changing the coverage area nor changing
the rules governing rate plans was within the power of the sponsors of the
study, although the information could be used by other parts of the business.
A potentially more actionable finding was that customers whose calling
patterns did not match their rate plan were at high risk for churn. There are two
ways that a customer's calling behavior may be inappropriate for his or her
rate plan. One segment of customers pays for more minutes than they actually
use. Arguably, a wireless company might be able to increase the lifetime value
of these customers by moving them to a lower rate plan. They would be worth
less each month, but might last longer. The only way to find out for sure would
be with a marketing test. After all, customers might accept the offer to pay less
each month, but still churn at the same rate. Or, the rate of churn might be
lowered, but not enough to make up for the loss in near-term revenue.
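
The trade-off can be made concrete with a deliberately simple model. The sketch below assumes a constant monthly churn rate and ignores discounting and costs, so expected lifetime is 1/churn months and lifetime value is monthly revenue divided by churn. The model and all numbers are illustrative assumptions, not figures from the study.

```python
def expected_ltv(monthly_revenue, monthly_churn):
    """Undiscounted lifetime value under a constant monthly churn rate.

    Assumes a geometric lifetime, so expected tenure is 1 / monthly_churn.
    """
    return monthly_revenue / monthly_churn

def breakeven_churn(old_revenue, old_churn, new_revenue):
    """Churn rate the cheaper plan must beat to match the old lifetime value."""
    return old_churn * new_revenue / old_revenue

# Hypothetical numbers: a $60 plan at 4% monthly churn versus a $45 plan.
print(f"{expected_ltv(60, 0.04):.0f}")         # 1500
print(f"{breakeven_churn(60, 0.04, 45):.3f}")  # 0.030
```

Under these assumptions, moving the customer to the $45 plan pays off only if monthly churn falls below 3 percent, which is the kind of question a marketing test would have to settle.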
The other type of mismatch between calling behavior and rate plan occurs
when subscribers sign up for a low-cost rate plan that does not include many
minutes of use and find themselves frequently using more minutes than the
plan allows. Since the extra minutes are charged at a high rate, these customers
end up paying higher bills than they would on a more expensive rate plan
with more included minutes. Moving these customers to a higher-rate plan
would save them some money, while also increasing the amount of revenue
from the fixed portion of their monthly bill.
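
Both kinds of mismatch can be detected by comparing what a subscriber pays on the current plan with what the same usage would cost on the alternatives. The sketch below uses two invented plans with made-up fees, included minutes, and overage rates; none of these values come from the study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    name: str
    monthly_fee: float          # fixed portion of the bill
    included_minutes: int
    overage_per_minute: float   # rate charged for minutes beyond the allowance

def monthly_bill(plan, minutes_used):
    """Fixed fee plus overage charges for minutes beyond the plan allowance."""
    extra = max(0, minutes_used - plan.included_minutes)
    return plan.monthly_fee + extra * plan.overage_per_minute

def cheapest_plan(plans, minutes_used):
    """Return the plan with the lowest bill for the given usage."""
    return min(plans, key=lambda p: monthly_bill(p, minutes_used))

# Hypothetical rate plans for illustration only.
PLANS = [
    Plan("basic", 20.0, 60, 0.50),
    Plan("standard", 40.0, 200, 0.35),
]

print(monthly_bill(PLANS[0], 150))      # 65.0 on the low-cost plan
print(cheapest_plan(PLANS, 150).name)   # standard
print(cheapest_plan(PLANS, 30).name)    # basic
```

In this example a 150-minute user on the low-cost plan pays 65.00 rather than 40.00, while the fixed portion of the bill would double after the move, which mirrors the situation described above.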

The Proof of the Pudding
Comcast was able to make a direct cost/benefit analysis of the combined data
mining and telemarketing action plan. Armed with this data, Comcast was
able to make an informed decision to invest in future data mining efforts. Of
course, the story does not really end there; it never does.
The company was faced with a whole new set of questions based on the data
that came back from the initial study. New hypotheses were formed and
tested. The response data from the telemarketing effort became fodder for a
new round of knowledge discovery. New product ideas and service plans were
tried out. Each round of data mining started from a higher base because the
company knew its customers better. That is the virtuous cycle of data mining.

Lessons Learned
In a business context, the successful introduction of data mining requires using
data mining techniques to address a real business challenge. For companies
that are just getting started with analytical customer relationship
management, integrating data mining can be a daunting task. A proof-of-concept
project is a good way to get started. The proof of concept should create a solid
business case for further integration of data mining into the company's
marketing, sales, and customer-support operations. This means that the project
should be in an area where it is easy to link improved understanding gained
through data mining with improved profitability.
The most successful proof-of-concept projects start with a well-defined
business problem, and use data related to that problem to create a plan of action.
The action is then carried out in a controlled manner and the results carefully
analyzed to evaluate the effectiveness of the action taken. In other words, the
proof of concept should involve one full trip around the virtuous cycle of data
mining. If this initial project is successful, it will be the first of many. The
primary lesson from this chapter is also an important lesson of the book as a
whole: data mining techniques only become useful when applied to
meaningful problems. Data mining is a technical activity that requires technical
expertise, but its success is measured by its effect on the business.

