

the customer acquisition process. When it fails, potentially valuable customers
are kept away.

Relationship Management
Once a prospect has become a customer, the goal is to increase the customer's
value. This usually entails the following activities:
Data Mining throughout the Customer Life Cycle 467

Up-Selling. Having the customer buy premium products and services.
Cross-Selling. Broadening the customer relationship, such as having customers
buy CDs, plane tickets, and cars, in addition to books.
Usage Stimulation. Ensuring that the customer comes back for more, for
example, by ensuring that customers see more ads or use their credit
cards for more purchases.
These three activities are very amenable to data mining, particularly predictive
modeling that can determine which customers are the best targets for
which messages. This type of predictive modeling often determines the course
of action for customers, as discussed in Chapter 3. The challenge, however, is
to give customers the right marketing messages without inundating them with
too many or contradictory messages.
Although telephone calls and mail solicitations are bothersome, unwanted
email messages (often called spam) tend to have a more negative effect on the
customer relationship. One reason may be that customers are often paying for
their Internet connection or for the disk space for email. Another reason may
be that this mail may arrive at work, rather than at home. Then there is the
problem of spam that includes annoying pop-up ads. And, of course, such
email has often been quite unsolicited, offending people who do not want to
receive solicitations for gambling, money laundering, Viagra, sex sites, debt
reduction, illegal pyramid marketing schemes, and the like.
Because email is abused so often, even legitimate companies that are
communicating with bona fide customers run the risk of being associated with
the dubious ones. This is a real danger, and it suggests that customer contact
needs to be broader than email.
Another danger for companies that offer many products and services is getting
the right message across. Customers do not necessarily want choice; customers
simply want you to provide what they want. Making customers find
the one thing that interests them in a barrage of marketing communication
does not do a good job of getting the message across. For this reason, it is
useful to focus messages to each customer on a small number of products that are
likely to interest that customer. Of course, each customer has a different
potential set. Data mining plays a key role here in finding these associations.
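
The idea of limiting each customer to a few relevant products can be sketched
in a few lines of Python. This is only an illustration, not a method given in
the text: the function name top_offers, the score threshold, and the example
scores are all assumptions, and the scores stand in for the output of whatever
predictive model is available.

```python
from typing import Dict, List


def top_offers(scores: Dict[str, float], max_offers: int = 3,
               min_score: float = 0.1) -> List[str]:
    """Return at most `max_offers` products, best score first.

    `scores` maps product name -> model score (hypothetical values; any
    propensity model could supply them). Products scoring below `min_score`
    are dropped so that weak offers do not dilute the message.
    """
    # Sort by descending score; break ties alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, score in ranked if score >= min_score][:max_offers]


# Illustrative scores for one customer (made-up numbers).
example_scores = {"books": 0.62, "cds": 0.40, "plane_tickets": 0.05, "cars": 0.01}
print(top_offers(example_scores, max_offers=2))
```

Capping the list, rather than sending every offer with a nonzero score, is the
design choice that matters here: the customer sees a short message built from
his or her own potential set.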

Customer retention is one of the areas where predictive modeling is applied
most often. There are two approaches for looking at customer retention. The
first is the survival analysis approach described in Chapter 12, which attempts
to understand customer tenure. Survival analysis assigns a probability that a
customer is going to leave after some period of time.
468 Chapter 14


Forecasting customer stops and customer levels plays an important role in
businesses, particularly for planning future budgets and marketing endeavors.
A forecast provides an expected value (or set of expected values) that can be
used for comparing what actually happened to what was expected. This is a
natural application of data mining, particularly survival analysis.
The following figure shows what a forecasting engine looks like.

[Figure: forecasting engine. Box labels: Existing Customer; New Start Forecast;
Do Existing Base Forecast (EBF); Do New Start Forecast (NSF); Existing Customer
Base Forecast; New Start Forecast; Do Existing Base Churn Forecast; Do New
Start Churn Forecast (NSCF); Churn Forecast; Churn Actuals.]


A forecasting engine uses data mining to predict customer levels (and hence churn)
as well as providing explanations in the form of deviations from the expected.

There are five important inputs:

Effective Date. All numbers before this date are actuals; all numbers
after this date are forecasts.
Forecast Dimensions. These are attributes of customers, such as
product, geography, and the channel used for developing the forecast.
New Starts. This is a list of new starts broken down by the forecast
dimensions after the effective date.
Active Customers. This is a list of all customers active on the effective
date, including the forecast dimensions for each customer.
Actual Churn. These are actual stops broken into forecast dimensions;
these are used for comparisons for explanatory purposes. This is not
available when the forecast is being developed, but is used later.

The forecast is then broken into the following pieces. The existing base
forecast (EBF) determines the probability of each active customer being active
on given dates in the future; this forecast is a direct application of survival
analysis. The new start forecast (NSF) determines the contribution to the future
base from new starts. That is, these are the new starts who are active on future
dates. This is a direct application of survival analysis with a twist, because
every day, new customers are starting:

    NSF(t) = One Day Survival of NSF(t - 1) + New Starts(t)

The churn forecast is easily derived from the EBF and NSF. The existing base
churn forecast (EBCF) is the number of churners on a given day in the future
from the existing base. This is the difference in survival on successive days:

    EBCF(t) = EBF(t) - EBF(t + 1)

The new start churn forecast (NSCF) is the number of churners on a given day in
the future from the new starts. This is a little trickier to calculate, because
we have to take into account new starts:

    NSCF(t) = NSF(t - 1) - One Day Survival of NSF(t - 1)

The churn forecast is the sum of these:

    CF(t) = EBCF(t) + NSCF(t)
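
The pieces of the forecast can be sketched in Python as follows. This is a
minimal sketch under stated assumptions, not the engine itself: the function
names are invented for illustration, the survival curve is a made-up list in
which survival[d] is the probability of surviving to tenure d (with
survival[0] equal to 1), tenure is measured in days, and forecast dimensions
are ignored. The formulas are applied with the same indexing as in the text.

```python
from typing import List, Sequence, Tuple


def existing_base_forecast(tenures: Sequence[int], survival: Sequence[float],
                           horizon: int) -> List[float]:
    """EBF(t) for t = 0..horizon: expected active customers from the existing base.

    Each customer contributes the conditional probability of surviving t more
    days given his or her tenure on the effective date.
    """
    return [sum(survival[tenure + t] / survival[tenure] for tenure in tenures)
            for t in range(horizon + 1)]


def new_start_forecast(new_starts: Sequence[float],
                       survival: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (NSF, NSCF), indexed by day t after the effective date."""
    nsf: List[float] = []
    nscf: List[float] = []
    for t in range(len(new_starts)):
        # One-day survival of NSF(t - 1): every earlier cohort ages by one day.
        survived = sum(new_starts[s] * survival[t - s] for s in range(t))
        previous = nsf[t - 1] if t > 0 else 0.0
        nscf.append(previous - survived)        # NSCF(t)
        nsf.append(survived + new_starts[t])    # NSF(t)
    return nsf, nscf


def existing_base_churn(ebf: Sequence[float]) -> List[float]:
    """EBCF(t) = EBF(t) - EBF(t + 1)."""
    return [ebf[t] - ebf[t + 1] for t in range(len(ebf) - 1)]


# Illustrative inputs (assumptions): 1% daily attrition, three active customers
# with tenures of 0, 30, and 365 days, and 10 new starts per day.
survival = [0.99 ** d for d in range(400)]
ebf = existing_base_forecast([0, 30, 365], survival, horizon=3)
nsf, nscf = new_start_forecast([10, 10, 10], survival)
ebcf = existing_base_churn(ebf)
churn_forecast = [e + n for e, n in zip(ebcf, nscf)]  # CF(t) = EBCF(t) + NSCF(t)
print(ebf, nsf, churn_forecast)
```

In practice each function would be run once per combination of forecast
dimensions, so that the resulting churn forecast can be compared with actual
churn along the same breakdown.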
All of the pieces of the forecast typically use forecast dimensions. The result
is that the forecast can be compared to actuals, making it possible to explain
the results in terms understandable and useful to the business.

The power of survival analysis is that it focuses on what is often the most
important determinant of retention, customer tenure. Customers who have
been around for a long time are usually more likely to stay around longer.
However, survival analysis can also take into account other factors, through
several enhancements to the basic technique. When there is a lot of data,
different factors can be investigated independently, using a process called
stratification. When there are many other factors, then parametric modeling and
proportional hazards modeling provide a similar capability (these are not
discussed in detail in this book). In either case, it is possible to get an idea
of customers' remaining tenures. This is useful not only for retention
interventions, but also for customer lifetime value calculations and for
forecasting numbers of customers, as discussed in the sidebar “An Engine for
Churn Forecasting.”
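
Stratification can be illustrated with a small Python sketch that computes a
separate survival curve for each value of one factor. It assumes the empirical
hazard approach (stops at a tenure divided by the population at risk at that
tenure); the record layout, the function names, and the handling of customers
who are still active are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def survival_curve(records: Iterable[Tuple[int, bool]], max_tenure: int) -> List[float]:
    """Return S(0..max_tenure) from (tenure, stopped) records.

    Assumption: a stopped customer is at risk at tenures 0..tenure and stops at
    `tenure`; a still-active (censored) customer is at risk at 0..tenure - 1.
    """
    stops = [0] * (max_tenure + 1)
    at_risk = [0] * (max_tenure + 1)
    for tenure, stopped in records:
        last = tenure if stopped else tenure - 1
        for t in range(min(last, max_tenure) + 1):
            at_risk[t] += 1
        if stopped and tenure <= max_tenure:
            stops[tenure] += 1
    survival = [1.0]
    for t in range(max_tenure):
        hazard = stops[t] / at_risk[t] if at_risk[t] else 0.0
        survival.append(survival[-1] * (1.0 - hazard))
    return survival


def survival_by_stratum(records: Iterable[Tuple[str, int, bool]],
                        max_tenure: int) -> Dict[str, List[float]]:
    """Stratify (stratum, tenure, stopped) records and fit one curve per stratum."""
    groups: Dict[str, List[Tuple[int, bool]]] = defaultdict(list)
    for stratum, tenure, stopped in records:
        groups[stratum].append((tenure, stopped))
    return {name: survival_curve(rows, max_tenure) for name, rows in groups.items()}


# Made-up customers: (channel, tenure, stopped).
customers = [
    ("web", 1, True), ("web", 2, True), ("web", 3, False), ("web", 3, False),
    ("store", 3, False), ("store", 3, False), ("store", 2, True),
]
for channel, curve in survival_by_stratum(customers, max_tenure=3).items():
    print(channel, [round(s, 3) for s in curve])
```

Each stratum needs enough customers at every tenure for its hazards to be
stable, which is why the approach suits situations where there is a lot of data.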
An alternative approach is to predict who is going to leave for some small
amount of time in the future. This is more of a traditional predictive modeling
problem, where we are looking for patterns in similar data from the past. This
approach is particularly useful for focused marketing interventions. Knowing
who is going to leave in the near future makes the marketing campaign more
focused, so more money can be invested in saving each customer.
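
The arithmetic behind that last point can be shown with a short Python sketch.
The churn scores, the budget, and the function name are hypothetical; the
scores stand in for the output of any model that predicts who will leave in
the near future.

```python
from typing import Dict, List, Tuple


def plan_retention_campaign(scores: Dict[str, float], budget: float,
                            top_n: int) -> Tuple[List[str], float]:
    """Pick the `top_n` customers most likely to leave and split the budget.

    Returns the targeted customer ids (highest score first) and the amount
    available to spend on each of them.
    """
    targets = sorted(scores, key=scores.get, reverse=True)[:top_n]
    per_customer = budget / len(targets) if targets else 0.0
    return targets, per_customer


# Made-up scores: estimated probability of leaving in the near future.
churn_scores = {"a": 0.90, "b": 0.20, "c": 0.70, "d": 0.10}
print(plan_retention_campaign(churn_scores, budget=100.0, top_n=4))  # everyone
print(plan_retention_campaign(churn_scores, budget=100.0, top_n=2))  # focused
```

With the same budget, narrowing the campaign from all four customers to the
two highest-scoring ones doubles the amount available for each.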

Once customers have left, there is still the possibility that they can be lured
back. Winback tries to bring back valuable customers, by providing them with
incentives, products, and pricing promotions.
Winback tends to depend more on operational strategies than on data analysis.
Sometimes it is possible to determine why customers left. However, the
winback strategies need to begin as part of the retention efforts themselves.
Some companies, for instance, have specialized “save teams.” Customers cannot
leave without talking to a person who is trained in trying to retain them. In
addition to saving customers, save teams also do a good job of tracking the
reasons why customers are leaving, information that can be very valuable to
future efforts to keep customers.
Data analysis can sometimes help determine why customers are leaving,
particularly when customer service complaints can be incorporated into
operational data. However, trying to lure back disgruntled customers is quite hard.
The more important effort is trying to keep them in the first place with
competitive products, attractive offers, and useful services.

Lessons Learned
Customers, in all their forms, are central to business success. Some are big and
very important; these merit specialized relationships. Others are small and
very numerous. This is the sweet spot for data mining, because data mining
can help provide mass intimacy where it is too expensive to have personal
relationships with everyone all the time. Some are in between, requiring a
balance between these approaches.
Subscription-based relationships are a good model for customer relationships
in general because there is a well-defined beginning and end to the
relationship. Each customer has his or her own life cycle defined by events
such as marriage, graduation, children, moving, changing jobs, and so on. These
can be useful for marketing, but suffer from the problem that companies do not
know when they occur.
The customer life cycle, in contrast, looks at customers from the perspective
of their business relationship. First, there are prospects, who are activated to
become new customers. New customers offer opportunities for up-selling,
cross-selling, and usage stimulation. Eventually all customers leave, making
retention an important data mining application both for marketing and
forecasting. And once customers have left, they may be convinced to return
through winback strategies. Data mining can enhance all these business

As more of the world is technology-driven, more and more data is available,
particularly about customer behavior. Data mining seeks to use all this data to
advantage, by summarizing data and applying algorithms that produce
meaningful results even on large data sets.
In the midst of all this technology, though, the customer relationship still
maintains its central position. After all, customers, because they provide
revenue, are the one thing that businesses need to remain successful, year
after year. Eventually, other funding sources dry up. No computer ever made
a purchase from Amazon; no software ever paid for a Pez dispenser on eBay;
no cell phone ever made an airline or restaurant reservation. There are always
people, individually or collectively, on the other end.


Data Warehousing, OLAP,
and Data Mining

Since the introduction of computers into data processing centers in the 1960s,
just about every operational system in business has been computerized. These
automated systems run companies, spewing out large amounts of data along
the way. This automation has changed how we do business and how we live:
ATM machines, adjustable rate mortgages, just-in-time inventory control,
online retailing, credit cards, Google, overnight deliveries, and frequent
flier/buyer clubs are a few examples of how computer-based automation has
opened new markets and revolutionized existing ones. This is not a new story;
it has been going on for decades.
In a typical company, such systems create vast amounts of data spread
through scads of disparate systems, from general ledgers to sales force
automation systems, from inventory control to electronic data interchange
(EDI), and so on. Data about specific parts of a business is there: lots and
lots of data, somewhere, in some form. Data is available but not information,
and not the right information at the right time. The goal of data warehouses is
to make
the right information available at the right time. Data warehousing is the
process of bringing together disparate data from throughout an organization
for decision-support purposes.
A data warehouse serves as a decision-support system of record, making it
possible to reconcile reports because they have the same underlying source.
Such a system not only reduces the need to explain disparate results, but also
provides consistent views of the business across business units and time. We

474 Chapter 15

believe that informed decisions lead to better bottom-line results
over time, and data warehouses help managers make informed decisions.
Decision support, as used here, is an intentionally ambiguous concept. It can
be as rudimentary as getting production reports to front-line managers every
week. It can be as complex as sophisticated modeling of prospective customers
using neural networks to determine which message to offer. It can be and is
just about everything in between.
Data warehousing is a natural ally of data mining. Data mining seeks to find
actionable patterns in data and therefore has a firm requirement for clean and
consistent data. Much of the effort behind data mining endeavors is in the

