




[Chart: actual payment and minimum payment by month, January through December,
for a typical transactor. Vertical axis: $0 to $2,500.]
A typical transactor pays off the bill every month. The payment is typically
much larger than the minimum payment, except in months with few charges. This
transactor has an average balance of $1,196.

[Chart: actual payment and minimum payment by month, January through December,
for a typical convenience user. Vertical axis: $0 to $2,000.]
A typical convenience user uses the card when necessary and pays off the
balance over several months. This convenience user has an average balance of
$524.

Figure 17.16 These three charts show actual and minimum payments for three credit card
customers with a credit line of $2,000.


Manually looking at shapes is an inefficient way to categorize the behavior
of several million customers. Shape is a vague, qualitative notion. What is
needed is a score. One way to create a score is by looking at the area between
the “minimum payment” curve and the actual “payment” curve. For our purposes,
the area is the sum of the differences between the payment and the minimum.
For the revolver, this sum is $112; for the convenience user, $559.10; and
for the transactor, a whopping $13,178.90.
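
The area score can be sketched in a few lines of Python. This is only an
illustration: the function name and the monthly figures below are made up for
the example and are not the data behind Figure 17.16.

```python
def area_score(payments, minimums):
    """Sum of the monthly differences between actual and minimum payment."""
    if len(payments) != len(minimums):
        raise ValueError("payments and minimums must cover the same months")
    return sum(p - m for p, m in zip(payments, minimums))


# Hypothetical 12 months of data (assumed values, not from the text).
minimums = [40.0] * 12
revolver_payments = [45.0] * 12      # pays just over the minimum
transactor_payments = [900.0] * 12   # pays a large bill in full every month

revolver_score = area_score(revolver_payments, minimums)      # 60.0
transactor_score = area_score(transactor_payments, minimums)  # 10320.0
```

A low total means the customer pays close to the minimum; a high total means
the customer pays far more than the minimum.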
Preparing Data for Mining 587


This score makes intuitive sense. The lower it is, the more the customer
looks like a revolver. However, the score does not work for comparing two
cardholders with different credit lines. Consider an extreme case. If a
cardholder had a credit line of $100 and were a perfect transactor, then the
score would be no more than $1,200. And yet an imperfect revolver with a credit
line of $2,000 can have a much larger score.
The solution is to normalize the value by dividing each month's difference
by the total credit line and averaging the results over the 12 months. Now,
the three scores are 0.0047, 0.023, and 0.55, respectively. When the normalized
score is close to 0, the cardholder is close to being a perfect revolver. When
it is close to 1, the cardholder is close to being a perfect transactor.
Numbers in between represent convenience users. This provides a
revolver-transactor score for each customer, with convenience users
falling in the middle.
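
As a rough check, the normalized scores can be reproduced from the three sums
quoted earlier. The sketch below assumes that each sum is divided by the $2,000
credit line and then averaged over 12 months; that averaging step is an
inference that happens to reproduce the quoted figures, and the function names
are illustrative.

```python
def normalized_score(payments, minimums, credit_line):
    """Average monthly (payment - minimum), as a fraction of the credit line."""
    if credit_line <= 0:
        raise ValueError("credit_line must be positive")
    if not payments or len(payments) != len(minimums):
        raise ValueError("need matching, non-empty monthly series")
    ratios = [(p - m) / credit_line for p, m in zip(payments, minimums)]
    return sum(ratios) / len(ratios)


def normalized_from_total(total_difference, credit_line, months=12):
    """Same score, computed from an already-summed difference."""
    return total_difference / (credit_line * months)


# Sums of (payment - minimum) quoted in the text, credit line of $2,000.
totals = {"revolver": 112.00, "convenience": 559.10, "transactor": 13178.90}
scores = {name: normalized_from_total(t, 2000) for name, t in totals.items()}

for name, score in scores.items():
    print(f"{name}: {score:.4f}")
```

Rounded, these come out to 0.0047, 0.023, and 0.55.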
This score for customer behavior has some interesting properties. Someone
who never uses their card would have a minimum payment of 0 and an actual
payment of 0. These people look like revolvers. That might not be a good
thing. One way to resolve this would be to include the estimated revenue
potential with the behavior score, in effect, describing the behavior using two
numbers.
Another problem with this score is that as the credit line increases, a customer
looks more and more like a revolver, unless the customer charges more. To get
around this, the ratios could instead be the monthly balance to the credit line.
When nothing is owed and nothing paid, then everything has a value of 0.
Figure 17.17 shows a variation on this. This score uses the ratio of the
amount paid to the minimum payment. It has some nice features. Perfect
revolvers now have a score of 1, because their payment is equal to the
minimum payment. Someone who does not use the card has a score of 0.
Transactors and convenience users both have scores higher than 1, but it is
hard to differentiate between them.
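
A minimal sketch of this ratio follows. The function name is hypothetical, and
treating a month with no minimum payment as a score of 0 is an assumption
chosen to match the statement that an unused card scores 0.

```python
def payment_multiple(payment, minimum):
    """Amount paid as a multiple of the minimum payment for one month."""
    if minimum == 0:
        # Assumption: nothing owed means the card was not used, so score 0.
        return 0.0
    return payment / minimum


# Hypothetical monthly figures (not from the text).
perfect_revolver = payment_multiple(40.0, 40.0)   # pays exactly the minimum
unused_card = payment_multiple(0.0, 0.0)          # nothing owed, nothing paid
transactor = payment_multiple(900.0, 40.0)        # pays the full bill
```

Here the perfect revolver scores 1.0, the unused card 0.0, and the transactor
22.5.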
This section has shown several different ways of measuring the behavior of
a customer. All of these are based on the important variables relevant to the
customer and measurements taken over several months. Different measures
are more valuable for identifying various aspects of behavior.

The Ideal Convenience User
The measures in the previous section focused on the extremes of customer
behavior, as typified by revolvers and transactors. Convenience users were
just assumed to be somewhere in the middle. Is there a way to develop a score
that is optimized for the ideal convenience user?
588 Chapter 17


[Chart: payment as a multiple of the minimum payment (vertical axis, 0 to 120)
by month, January through December, with one curve each for the transactor,
the convenience user, and the revolver.]
Figure 17.17 Comparing the amount paid as a multiple of the minimum payment shows
distinct curves for transactors, revolvers, and convenience users.


First, let's define the ideal convenience user. This is someone who, twice a
year, charges up to his or her credit line and then pays the balance off over 4
months. There are few, if any, additional charges during the other 10 months of
the year. Table 17.7 illustrates the monthly balances for two convenience users
as a ratio of their credit lines.
This table also illustrates one of the main challenges in the definition of
convenience users. The values describing their behavior have no relationship to
each other in any given month. They are out of phase. In fact, there is a
fundamental difference between convenience users on the one hand and transactors
and revolvers on the other. Knowing that someone is a transactor exactly
describes their behavior in any given month: they pay off the balance.
Knowing that someone is a convenience user is less helpful. In any given month,
they may be paying nothing, paying off everything, or making a partial payment.

Table 17.7 Monthly Balances of Two Convenience Users Expressed as a Percentage of
Their Credit Lines

        JAN   FEB   MAR   APR   MAY   JUN   JUL   AUG   SEP   NOV   DEC
Conv1   80%   60%   40%   20%    0%    0%    0%   60%   30%   15%   70%
Conv2    0%    0%   83%   50%   17%    0%   67%   50%   17%    0%    0%


Does this mean that it is not possible to develop a measure to identify
convenience users? Not at all. The solution is to sort the 12 months of data by
the balance ratio and to create the convenience-user measure using the sorted
data.
Figure 17.18 illustrates this process. It shows the two convenience users,
along with the profile of the ideal convenience user. Here, the data is sorted,
with the largest values occurring first. For the first convenience user, month 1
refers to January. For the second, it refers to March.
Now, using the same idea of taking the area between the ideal and the actual
produces a score that measures how close a convenience user is to the ideal.
Notice that revolvers would have outstanding balances near the maximum for
all months. They would have high scores, indicating that they are far from the
ideal convenience user. For convenience users, the scores are much smaller.
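
The sort-then-compare idea can be sketched as follows. Several things here are
assumptions made for illustration: the exact shape of the ideal profile (two
cycles of 100%, 75%, 50%, 25%, then zero), the use of absolute differences as
the "area," and the example customers, which are invented rather than taken
from Table 17.7.

```python
# Assumed ideal: two charge-ups to the full credit line, each paid down in
# equal steps, with a zero balance in the remaining months (sorted high to low).
IDEAL_CONVENIENCE = (1.0, 1.0, 0.75, 0.75, 0.5, 0.5, 0.25, 0.25,
                     0.0, 0.0, 0.0, 0.0)


def convenience_score(balance_ratios, ideal=IDEAL_CONVENIENCE):
    """Area between the sorted balance ratios and the sorted ideal profile.

    Sorting removes the month-to-month phase, so only the shape matters.
    A score of 0 means the customer matches the ideal exactly.
    """
    if len(balance_ratios) != len(ideal):
        raise ValueError("balance_ratios must have one value per ideal month")
    actual = sorted(balance_ratios, reverse=True)
    target = sorted(ideal, reverse=True)
    return sum(abs(a - t) for a, t in zip(actual, target))


# Hypothetical customers: monthly balance divided by credit line.
out_of_phase_user = [0.0, 0.0, 1.0, 0.75, 0.5, 0.25,
                     0.0, 0.0, 1.0, 0.75, 0.5, 0.25]
revolver = [0.95] * 12

user_score = convenience_score(out_of_phase_user)
revolver_score = convenience_score(revolver)
```

The out-of-phase user matches the assumed ideal once the months are sorted, so
the score is 0; the revolver, with a balance near the limit all year, scores far
higher.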
This case study has shown several different ways of segmenting customers.
All make use of derived variables to describe customer behavior. Often, it is
possible to describe a particular behavior and then to create a score that
measures how each customer's behavior compares to the ideal.


[Chart: ratio of balance to credit line (vertical axis, 0% to 100%) by month,
1 through 12, sorted from highest balance to lowest, with curves for the ideal
transactor, the ideal convenience user, convenience user 1, and convenience
user 2.]
Figure 17.18 Comparison of two convenience users to the ideal, by sorting the months by
the balance ratio.


The Dark Side of Data

Working with data is a critical part of the data mining process. What does the
data mean? There are many ways to answer this question: through written
documents, in database schemas, in file layouts, through metadata systems,
and, not least, via the database administrators and systems analysts who know
what is really going on. No matter how good the documentation, the real story
lies in the data.
There is a misconception that data mining requires perfect data. In the
world of business analysis, the perfect is definitely the enemy of the
sufficiently good. For one thing, exploring data and building models highlights
data issues that are otherwise unknown. Starting the process with available
