

Searching for Islands of Simplicity 350

Star Light, Star Bright 351

Fitting the Troops 352

K-Means Clustering 354

Three Steps of the K-Means Algorithm 354

What K Means 356

Similarity and Distance 358

Similarity Measures and Variable Type 359

Formal Measures of Similarity 360

Geometric Distance between Two Points 360

Angle between Two Vectors 361

Manhattan Distance 363

Number of Features in Common 363

Data Preparation for Clustering 363

Scaling for Consistency 363

Use Weights to Encode Outside Information 365

Other Approaches to Cluster Detection 365

Gaussian Mixture Models 365

Agglomerative Clustering 368

An Agglomerative Clustering Algorithm 368

Distance between Clusters 368

Clusters and Trees 370

Clustering People by Age: An Example of Agglomerative Clustering 370

Divisive Clustering 371

Self-Organizing Maps 372

Evaluating Clusters 372

Inside the Cluster 373

Outside the Cluster 373

Case Study: Clustering Towns 374

Creating Town Signatures 374

The Data 375

Creating Clusters 377

Determining the Right Number of Clusters 377

Using Thematic Clusters to Adjust Zone Boundaries 380

Lessons Learned 381

xiv Contents


Chapter 12 Knowing When to Worry: Hazard Functions and Survival Analysis in Marketing 383

Customer Retention 385

Calculating Retention 385

What a Retention Curve Reveals 386

Finding the Average Tenure from a Retention Curve 387

Looking at Retention as Decay 389

Hazards 394

The Basic Idea 394

Examples of Hazard Functions 397

Constant Hazard 397

Bathtub Hazard 397

A Real-World Example 398

Censoring 399

Other Types of Censoring 402

From Hazards to Survival 404

Retention 404

Survival 405

Proportional Hazards 408

Examples of Proportional Hazards 409

Stratification: Measuring Initial Effects on Survival 410

Cox Proportional Hazards 410

Limitations of Proportional Hazards 411

Survival Analysis in Practice 412

Handling Different Types of Attrition 412

When Will a Customer Come Back? 413

Forecasting 415

Hazards Changing over Time 416

Lessons Learned 418

Chapter 13 Genetic Algorithms 421

How They Work 423

Genetics on Computers 424

Selection 429

Crossover 430

Mutation 431

Representing Data 432

Case Study: Using Genetic Algorithms for Resource Optimization 433

Schemata: Why Genetic Algorithms Work 435

More Applications of Genetic Algorithms 438

Application to Neural Networks 439

Case Study: Evolving a Solution for Response Modeling 440

Business Context 440

Data 441

The Data Mining Task: Evolving a Solution 442

Beyond the Simple Algorithm 444

Lessons Learned 446



Chapter 14 Data Mining throughout the Customer Life Cycle 447

Levels of the Customer Relationship 448

Deep Intimacy 449

Mass Intimacy 451

In-between Relationships 453

Indirect Relationships 453

Customer Life Cycle 454

The Customer's Life Cycle: Life Stages 455

Customer Life Cycle 456

Subscription Relationships versus Event-Based Relationships 458

Event-Based Relationships 458

Subscription-Based Relationships 459

Business Processes Are Organized around the Customer Life Cycle 461

Customer Acquisition 461

Who Are the Prospects? 462

When Is a Customer Acquired? 462

What Is the Role of Data Mining? 464

Customer Activation 464

Relationship Management 466

Retention 467

Winback 470

Lessons Learned 470

Chapter 15 Data Warehousing, OLAP, and Data Mining 473

The Architecture of Data 475

Transaction Data, the Base Level 476

Operational Summary Data 477

Decision-Support Summary Data 477

Database Schema 478

Metadata 483

Business Rules 484

A General Architecture for Data Warehousing 484

Source Systems 486

Extraction, Transformation, and Load 487

Central Repository 488

Metadata Repository 491

Data Marts 491

Operational Feedback 492

End Users and Desktop Tools 492

Analysts 492

Application Developers 493

Business Users 494

Where Does OLAP Fit In? 494

What's in a Cube? 497
