

Fortunately, computers are sufficiently powerful that the question is more
about budget than possibility. Relational databases can also take advantage of
the most powerful hardware, parallel computers.
Online Analytic Processing (OLAP) is a powerful part of data warehousing.
OLAP tools are very good at handling summarized data, allowing users to
summarize information along one or several dimensions at one time. Because
these systems are optimized for user reporting, they often have interactive
response times of less than 5 seconds.
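
To make the idea of summarizing along one or several dimensions concrete, here is a minimal sketch in Python. It is not how any particular OLAP product works; the fact rows, the dimension names, and the `roll_up` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, region, product, sales amount).
FACTS = [
    ("2004-01", "East", "skirt", 120.0),
    ("2004-01", "West", "tights", 30.0),
    ("2004-02", "East", "tights", 45.0),
    ("2004-02", "East", "skirt", 80.0),
    ("2004-02", "West", "skirt", 60.0),
]
# Names of the dimension columns, in the order they appear in each row.
DIMENSIONS = ("month", "region", "product")


def roll_up(facts, dims):
    """Sum the sales measure over every combination of the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]  # the last column is the measure
    return dict(totals)


# Summarize along one dimension (time), then along two at once.
by_month = roll_up(FACTS, ["month"])
by_month_region = roll_up(FACTS, ["month", "region"])
```

Adding a dimension to the list, as in the second call, gives a finer-grained summary of the same underlying rows.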
Any well-designed OLAP system has time as a dimension, making it very
useful for seeing trends over time. Trying to accomplish the same thing on a
normalized data warehouse requires very complicated queries that are prone
to error. To be most useful, OLAP systems should allow users to drill down to
detail data for all reports. This capability ensures that all data is making it into
the cubes, as well as giving users the ability to spot important patterns that
may not appear in the dimensions.
As we have pointed out throughout this chapter, OLAP complements
data mining. It is not a substitute for it. It provides better understanding of data,
and the dimensions developed for OLAP can make data mining results more
actionable. However, OLAP does not automatically find patterns in data.
OLAP is a powerful way to distribute information to many end users for
advanced reporting needs. It provides the ability to let many more users base
their decisions on data, instead of on hunches, educated guesses, and personal
experience. OLAP complements undirected data mining techniques such as
clustering. OLAP can provide the insight needed to find the business value in
the identified clusters. It also provides a good visualization tool to use with
other methods, such as decision trees and memory-based reasoning.
Data warehousing and data mining are not the same thing; however, they
do complement each other, and data mining applications are often part of the
data warehouse solution.


Building the Data Mining Environment
In the Big Rock Candy Mountains,
There's a land that's fair and bright,
Where the handouts grow on bushes
And you sleep out every night.
Where the boxcars all are empty
And the sun shines every day
And the birds and the bees
And the cigarette trees
The lemonade springs
Where the bluebird sings
In the Big Rock Candy Mountains.

Twentieth-century hoboes had a vision of utopia, so why not twenty-first-century
data miners? For us, the vision is one of a company that puts the customer
at the center of its operations and measures its actions by their effect on
long-term customer value. In this ideal organization, business decisions are
based on reliable information distilled from vast quantities of customer data.
Needless to say, data miners (the people with the skills to turn all that data
into the information needed to run the company) are held in great esteem.
This chapter starts with a utopian vision of a truly customer-centric
organization with the ideal data mining environment to produce the information
on which all decisions are based. Having a description of what the ideal data
mining environment would look like is helpful for establishing more realistic
near-term goals. The chapter then goes on to look at the various components of
the data mining environment: the staff, the data mining infrastructure, and the
data mining software itself. Although we may not be able to achieve all
elements of the utopian vision, we can use the vision to help create an
environment suitable for successful data mining work.

514 Chapter 16

A Customer-Centric Organization

Despite the familiar cliché that the customer is king, in most companies
customers are not treated much like royalty. One reason is that most businesses
are not organized around customers; they are organized around products.
Supermarkets, for example, have long been able to track the inventory levels
of tens of thousands of products in order to keep the shelves well stocked, and
they are able to calculate the profit margin on any item. But, until recently,
these same stores knew nothing about individual customers: not their names,
nor how many trips per month they make, nor what time of day they tend to
shop, nor whether they use coupons, nor if they have children, nor what
percent of the household's shopping is done in this store, nor how close they
live. Nothing. We don't mean to pick on supermarkets. Banks have been
organized around loans; telephone companies have been organized around
switches; airlines have been organized around operations. None have known
much (or cared much) about customers.
In all of these industries, technology now makes it possible to shift the focus
to customers. Such a shift is not easy; in fact, it is nothing short of
revolutionary. By combining point-of-sale scanner data with a loyalty card
program, a grocery retailer can, with a lot of effort, learn who is buying what
and when they buy it, which customers are price-sensitive and which ones like
to try new products, which ones like to bake from scratch and which ones prefer
prepared meals, and so on. A telephone company can figure out who is making
business calls and who is primarily chatting with friends. An online music
store can make individualized recommendations of new music.
The harder challenge is being able to make effective use of this new ability
to see customers in data. A truly customer-centric organization would be
happy to continue offering an unprofitable service if the customers who use
the loss-generating service spend more in other areas and therefore increase
the profitability of the company as a whole. A customer-centric company does
not have to ask the same questions every time a customer calls in. A
customer-centric company judges a marketing campaign on the value customers
generate over their lifetimes rather than on the initial response rate.
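
The difference between the two yardsticks can be shown with a little arithmetic. The following sketch uses two made-up campaigns; the campaign names, the counts, and the average lifetime values are hypothetical figures chosen only to show that the two measures can disagree, not results from any real campaign.

```python
def response_rate(campaign):
    """Fraction of contacted customers who responded."""
    return campaign["responders"] / campaign["contacted"]


def lifetime_value_per_contact(campaign):
    """Total lifetime value of responders, spread over everyone contacted."""
    total = campaign["responders"] * campaign["avg_lifetime_value"]
    return total / campaign["contacted"]


# Hypothetical campaigns (all numbers are invented for illustration).
campaigns = {
    "discount_offer": {"contacted": 10_000, "responders": 500,
                       "avg_lifetime_value": 40.0},
    "loyalty_offer": {"contacted": 10_000, "responders": 300,
                      "avg_lifetime_value": 150.0},
}

best_by_response = max(campaigns, key=lambda n: response_rate(campaigns[n]))
best_by_value = max(campaigns,
                    key=lambda n: lifetime_value_per_contact(campaigns[n]))
```

With these invented numbers, the discount offer wins on response rate (5 percent against 3 percent), while the loyalty offer wins on lifetime value per customer contacted (4.50 against 2.00).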
Becoming truly customer-centric means changing the corporate culture and
the way everyone, from top managers to call-center operators, is rewarded. As
long as each product line has a manager whose compensation is tied to the
amount and margin of product sold, the company will remain focused on
products rather than customers. In other words, the company is paying its
managers to focus on products, and the managers are doing their jobs. In the
ideal customer-centric organization, everyone is rewarded for increasing
customer value and understands that this requires learning from each customer
interaction and the ability to use what has been learned to serve customers
better. As a result, the company records every interaction with its customers
and keeps an extensive historical record of these interactions.

An Ideal Data Mining Environment
The ideal context for data mining is an organization that appreciates the value
of information. Bringing together customer data from all of the many places
where it is originally collected and putting it into a form suitable for data
mining is a difficult and expensive process. It will only happen in an
organization that understands how valuable that data is once it can be properly
exploited. Information is power. A learning organization values progress and
steady improvement; such an organization wants and invests in accurate
information. Remember that the producers of information always have real power
to determine what data is available and when. They are not passive consumers of
a take-it-or-leave-it data warehouse; they have the power to determine what
data is available, although collecting such data might mean changing
operational procedures.

The Power to Determine What Data Is Available
In the ideal data mining environment, the importance of data analysis is
recognized and its results are shared across the organization. Marketing people
instinctively regard every campaign as a controlled experiment, even when
that means not including some customers in a promising campaign because
those customers are part of a control group. Designers of operational systems
instinctively keep track of all customer transactions, including nonbillable
ones such as customer service inquiries, bank account balance inquiries, or
visits to particular sections of the company Web site. Everyone expects that
customer interactions from different channels can be identified as involving
the same customer, even when some happen at an ATM, some in a bank branch,
some over the phone, and some on the Web.
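
As one illustration of treating a campaign as a controlled experiment, the sketch below holds out a fixed share of customers as a control group. It is only one possible approach, under assumptions of our own: the `in_control_group` function, the customer ID format, the campaign name, and the 10 percent holdout are all hypothetical. Hashing the customer ID makes the assignment repeatable, so the same customer lands in the same group no matter which channel or system asks.

```python
import hashlib


def in_control_group(customer_id: str, campaign: str, holdout_pct: int = 10) -> bool:
    """Deterministically place roughly holdout_pct percent of customers in control."""
    # Hash campaign and customer together so each campaign gets its own split.
    digest = hashlib.sha256(f"{campaign}:{customer_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # a stable bucket number from 0 to 99
    return bucket < holdout_pct


# Hypothetical customer IDs and campaign name.
customers = [f"C{n:05d}" for n in range(1000)]
control = [c for c in customers if in_control_group(c, "spring_offer")]
treated = [c for c in customers if not in_control_group(c, "spring_offer")]
```

Customers in `control` are deliberately left out of the campaign so that their later behavior can be compared with that of the `treated` group.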
In such an environment, an analyst at a telephone company trying to
understand the relationship between quality of wireless telephone service and
churn has no trouble getting customer-level data on dropped calls and other
failures. The analyst can also readily see a customer's purchase history even
though some purchases were made in stores, some through the mail-order catalog,
and some on the Web. It is similarly easy to determine, for each of a
customer's calls to customer service, the duration of the call and whether the
call was handled by a human representative or stayed in the IVR, and in the
latter case, what path was followed through the prompts. Best of all, when the
required data is not readily available, there is a team of people whose job it
is to make it available, whether that means redesigning an application form,
reprogramming an automated switch, or simply loading the data correctly in the
first place.

The Skills to Turn Data into Actionable Information
The ideal data mining environment is staffed by people whose superior skills
in data processing and data mining are surpassed only by their intimate
understanding of how the business operates and its goals for the future. The
data mining group includes database experts, programmers, statisticians, data
miners, and business analysts, all working together to ensure that business
decisions are based on accurate information. This team of people has the
communication skills to spread whatever they may learn to the appropriate parts
of the organization, whether that is marketing, operations, management, or

All the Necessary Tools
The ideal data mining environment includes sufficient computing power and
database resources to support the analysis of the most detailed level of
customer transactions. It includes software for manipulating all that data and
creating model sets from it. And, of course, it includes a rich collection of
data mining software so that all the techniques from Chapters 5–13 can be
applied.

Back to Reality
Readers will not be shocked to learn that we have never seen the ideal data
mining environment just described. We have, however, worked with many
companies that are moving in the right direction. These companies are taking
steps to transform themselves into customer-centric organizations. They are
building data mining groups. They are gathering customer data from
operational systems and creating a single customer view. Many of them are
already reaping substantial benefits.

Building a Customer-Centric Organization
The first component of the utopian vision that opened the chapter was a truly
customer-centric organization. In terms of data, one of the hardest parts of
building a customer-centric organization is establishing a single view of the
customer shared across the entire enterprise that informs every customer
interaction. The flip side of this challenge is establishing a single image of
the company and its brand across all channels of communication with the
customer, including retail stores, independent dealers, the Web site, the call
centers, advertising, and direct marketing. The goal is not only to make more
informed decisions; the goal is to improve the customer experience in a
measurable way. In other words, the customer strategy has both analytic and
operational components. This book is more concerned with the analytic
component, but both are critical to success.

TIP Building a customer-centric organization requires a strategy with both
analytic and operational components. Although this book is about the
analytical component, the operational component is also critical.

Building a customer-centric organization requires centralizing customer
information from a variety of sources in a single data warehouse, along with a
set of common definitions and well-understood business processes describing
the source of the data. This combination makes it possible to define a set of
customer metrics and business rules used by all groups to monitor the business
and to measure the impact of changing market conditions and new initiatives.
The centralized store of customer information is, of course, the data
warehouse described in the previous chapter. As shown in Figure 16.1, there is
two-way traffic between the operational systems and the data warehouse.
Operational systems supply the raw data that goes into the data warehouse,
and the warehouse in turn supplies customer scores, decision rules, customer
segment definitions, and action triggers to the operational system. As an
example, the operational systems of a retail Web site capture all customer
orders. These orders are then summarized in a data warehouse. Using data
from the data warehouse, association rules are created and used to generate
cross-sell recommendations that are sent back to the operational systems. The
end result: a customer comes to the site to order a skirt and ends up with
several pairs of tights as well.
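
The round trip from orders to cross-sell recommendations can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the method used by any particular retailer or tool: it considers only rules between pairs of items, and the sample orders, the thresholds, and the `pair_rules` and `recommend` helpers are all hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical orders summarized from the warehouse, one set of items per order.
ORDERS = [
    {"skirt", "tights"},
    {"skirt", "tights", "blouse"},
    {"skirt"},
    {"blouse", "scarf"},
    {"tights", "scarf"},
]


def pair_rules(orders, min_support=0.2, min_confidence=0.6):
    """Return rules lhs -> rhs as {(lhs, rhs): (support, confidence)}."""
    n = len(orders)
    item_counts = Counter(item for order in orders for item in order)
    # Count every ordered pair of distinct items that appear in the same order.
    pair_counts = Counter(
        pair for order in orders for pair in permutations(sorted(order), 2)
    )
    rules = {}
    for (lhs, rhs), count in pair_counts.items():
        support = count / n                    # share of orders with both items
        confidence = count / item_counts[lhs]  # share of lhs orders that include rhs
        if support >= min_support and confidence >= min_confidence:
            rules[(lhs, rhs)] = (support, confidence)
    return rules


def recommend(rules, item):
    """Items to suggest when a customer orders `item`, strongest rule first."""
    candidates = [(conf, rhs) for (lhs, rhs), (_, conf) in rules.items() if lhs == item]
    return [rhs for conf, rhs in sorted(candidates, reverse=True)]


rules = pair_rules(ORDERS)
suggestions = recommend(rules, "skirt")
```

In this sketch, `rules` plays the part of what the warehouse sends back, and `recommend` stands in for the lookup an operational system would do at order time.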

Creating a Single Customer View
Every part of the organization should have access to a single shared view of
the customer and present the customer with a single image of the company. In
practical terms that means sharing a single customer profitability model, a
single payment default risk model, a single customer loyalty model, and shared
definitions of such terms as customer start, new customer, loyal customer, and
valuable customer.
[Figure: diagram text not recoverable. Legible labels: Operational Systems;
Operational Data (billing, usage, ...); Business Users.]

