

aggression, behavior-based variables, 18

147–148

ad hoc questions

AI (artificial intelligence), 15

behavior-based variables, 585

algorithms, recursive, 173

business opportunities, identifying, 27

alphas, decision trees, 188

American Express

hypothesis testing, 50–51

as information broker, 16

additive facts, OLAP, 501

orders, market basket analysis, 292

addresses, geographical resources, 555–556


616 Index


analysis sensitivity, 247–248
differential response, 107–108
sequential, 318–319
link analysis
statistical
acyclic graphs, 331
acuity of testing, marketing campaign approaches, 147–148
authorities, 333–334
candidates, 333
business data versus scientific data, 159
case study, 343–346

classification, 9
censored data, 161

communities of interest, graphs, 346
Central Limit Theorem, 129–130

cyclic graphs, 330–331
chi-square tests, 149–153

data, as graphs, 340
confidence intervals, marketing campaign approaches, 146

directed graphs, 330

discussed, 321
continuous variables, 137–138

edges, graphs, 322
correlation ranges, 139

fax machines, 337–341
cross-tabulations, 136

graph-coloring algorithm, 340–341
density function, 133

Hamiltonian path, graphs, 328
as disciplinary technique, 123

hubs, 332–334
discrete values, 127–131

Kleinberg algorithm, 332–333
experimentation, 160–161

nodes, graphs, 322
field values, 128

planar graphs, 323
histograms and, 127

root sets, 333
mean values, 137

search programs, 331
median values, 137

stemming, 333
mode values, 137

traveling salesman problem, graphs, 327–329
multiple comparisons, 148–149

normal distribution, 130–132

vertices, graphs, 322
null hypothesis and, 125–126

weighted graphs, 322, 324
probabilities, 133–135

market basket
proportion, standard error of, marketing campaign approaches, 139–141

differentiation, 289
discussed, 287

geographic attributes, 293
p-values, 126

item popularity, 293
q-values, 126

item sets, 289
range values, 137

market basket data, 51, 289–291
regression ranges, 139

marketing interventions, tracking, 293–294
sample sizes, marketing campaign approaches, 145

order characteristics, 292
sample variation, 129

products, clustering by usage, 294–295
standard deviation, 132, 138

standardized values, 129–133

purchases, 289
sum of values, 137–138

support, 301
time series analysis, 128–129

telecommunications customers, 288
truncated data, 162

time attributes, 293



sequential analysis, 318–319

variance, 138

for store comparisons, 315–316

z-values, 131, 138

trivial rules, 297

survival

virtual items, 307

attrition, handling different types of, 412–413

assumptions, validation, 67

attrition

customer relationships, 413–415

discussed, 17

estimation tasks, 10

forced, 118

forecasting, 415–416

future, 49

time series

proof-of-concept projects, 599

neural networks, 244–247

survival analysis, 412–413

non-time series data, 246

audio, binary data, 557

SQL data, 572–573

authorities, link analysis, 333–334

statistics, 128–129

automated systems

of variance, 124

neural networks, 213

analysts, responsibilities of, 492–493

transaction processing systems, 3–4

analytic efforts, wasted time, 27

automatic cluster detection

AND value, neural networks, 222

agglomerative clustering, 368–370

angles, between vectors, 361–362

case study, 374–378

anonymous versus identified transactions, association rules, 308

categorical variables, 359

centroid distance, 369

application programming interface (API), 535

complete linkage, 369

data preparation, 363–365

architecture, data mining, 528“532

dimension, 352

artificial intelligence (AI), 15

directed clustering, 372

assessing models

discussed, 12, 91, 351

classifiers and predictors, 79

distance and similarity, 359–363
