
  statistics, 160–161
exploration tools, decision trees as, 203–204
exponential decay, retention, 389–390, 393
expressive power, descriptive models, 78
extraction, transformation, and load (ETL) tools, 487, 595

F

F tests (Ronald A. Fisher), 183–184
fax machines, link analysis, 337–341
Federal Express, transaction processing systems, 3–4
feedback
  change processes, 34
  operational, 485, 492
  relevance feedback, MBR, 267–268
feed-forward neural networks
  back propagation, 228–232
  hidden layer, 227
  input layer, 226
  output layer, 227
field values, statistics, 128
Fisher, Ronald A. (F tests), 183–184
fixed budgets, marketing campaigns, 97–100
fixed positions, genetic algorithms, 435
fixed-length character strings, 552–554
flat files, dumping data, 594
forced attrition, 118
forecasting
  EBCF (existing base churn forecast), 469
  NSF (new start forecast), 469
  survival analysis, 415–416
former customers, customer relationships, 457
forward-looking businesses, 2
fraud detection, MBR, 258
functionality, lack of, data transformation, 28
functions
  activation, 222
  CHIDIST, 152
  combination
    attrition history, 280
    MBR (memory-based reasoning), 258, 265
    neural networks, 272
    weighted voting, 281–282
  density, 133
  distance
    defined, 271–272
    discussed, 258, 265
    hidden distance fields, 278
    identity distance, 271
    numeric fields, 275
    triangle inequality, 272
    zip codes, 276–277
  hyperbolic tangent, 223
  NORMDIST, 134
  NORMSINV, 147
  sigmoid, 225
  summation, 272
  tangent, 223
  transfer, 223
future attrition, 49
future customer behaviors, predicting, 10

G

gains, cumulative, 36, 101
Gaussian mixture model, automatic cluster detection, 366–367
gender
  as categorical value, 239
  profiling example, 12
generalized delta rules, 229



genetic algorithms
  case study, 440–443
  crossover, 430
  data representation, 432–433
  genome, 424
  implicit parallelism, 438
  maximum values, of simple functions, 424
  mutation, 431–432
  neural networks and, 439–440
  optimization, 422
  overview, 421–422
  resource optimization, 433–435
  response modeling, 440–443
  schemata, 434, 436–438
  selection step, 429
  statistical regression techniques, 423
Genetic Algorithms in Search, Optimization, and Machine Learning (Goldberg), 445
geographic attributes, market basket analysis, 293
geographic information system (GIS), 536
geographical resources, 555–556
geometric distance, automatic cluster detection, 360–361
gigabytes, 5
Gini, Corrado (Gini splitting criterion, decision trees), 178
GIS (geographic information system), 536
goals, formulating, 605–606
Goldberg (Genetic Algorithms in Search, Optimization, and Machine Learning), 445
good customers, holding on to, 17–18
good prospects, identifying, 88–89
Goodman, Marc (projective visualization), 206–208
graphical user interface (GUI), 535
graphs
  acyclic, 331
  cyclic, 330–331
  data as, 337
  directed, 330
  edges, 322
  graph-coloring algorithm, 340–341
  Hamiltonian path, 328
  linkage, 77
  nodes, 322
  planar, 323
  traveling salesman problem, 327–329
  vertices, 322
grouping. See clustering
GUI (graphical user interface), 535

H

Hamiltonian path, graph theory, 328
hard clustering, automatic cluster detection, 367
hazards
  bathtub, 397–398
  censoring, 399–403
  constant, 397, 416–417
  probabilities, 394–396
  proportional
    Cox, 410–411
    discussed, 408
    examples of, 409
    limitations of, 411–412
  real-world example, 398–399
  retention, 404–405
  stratification, 410
Hertzsprung-Russell diagram, automatic cluster detection, 352–354
hidden distance fields, distance function, 278
hidden layer, feed-forward neural networks, 221, 227
hierarchical categories, products, 305
histograms
  data exploration, 565–566
  discussed, 543
  statistics and, 127
historical data
  customer behaviors, 5
  documentation as, 61


  MBR (memory-based reasoning), 262–263
  neural networks, 219
  prediction tasks, 10
hobbies, house-hold level data, 96
holdout groups, marketing campaigns, 106
home-based businesses, 56
house-hold level data, 96
hubs, link analysis, 332–334
hyperbolic tangent function, 223
hypothesis testing
  confidence levels, 148

inconclusive survey responses, 46
inconsistent data, 593–594
index-based scores, 92–95
indicator variables, 554
indirect relationships, customer relationships, 453–454
industry revolution, 18
inexplicable rules, association rules, 297–298
information
  competitive advantages, 14
  data as, 22
  infomediaries, 14
