Actes du 25e Congrès International de l’AFM – Londres, 14 et 15 mai 2009
Modélisation de la valeur client avec des approches déterministes et
stochastiques. Investigation de l'impact de l'hétérogénéité
Michel CALCIU
Docteur
IAE - Université des Sciences et Technologies de Lille - LEM
Francis SALERNO
Professeur
IAE - Université des Sciences et Technologies de Lille - LEM
Résumé (FR):
Ce papier fait un panorama des modélisations de la valeur actualisée du client et se concentre
sur deux principales méthodes de calcul : l'approche déterministe et l'approche stochastique. La
première ignore l'hétérogénéité des taux de rétention et/ou d'attrition au sein d'une cohorte. Elle
produit des formules faciles à utiliser par les managers et permet de résoudre une palette plus
large de problèmes managériaux. La deuxième approche apporte plus de précision dans les
calculs en intégrant l'hétérogénéité des taux de rétention et d'attrition, mais elle les mesure en
tant que variables actuarielles et non pas comme réponses à des efforts marketing. Nous
introduisons des formules pour calculer le nombre résiduel de transactions espérées
conditionné par le profil d'achat passé pour des modèles déterministes dans des situations
non-contractuelles et proposons des solutions pour évaluer les erreurs de prédiction par
rapport aux modèles stochastiques.
Mots clés : capital client, CRM, modélisation
Customer Lifetime Value Modelling with deterministic and stochastic approaches.
Investigating the impact of ignored heterogeneity
Abstract (EN):
This paper gives a panorama of Customer Lifetime Value (CLV) modelling and focuses on
the two main categories of CLV calculation methods: the deterministic and the stochastic
approach. The first adopts simplified calculations that ignore heterogeneity in customers'
retention and/or churn rates within a cohort. It produces formulas that are easy to use by
managers and solves a larger number of managerial problems. The second approach brings
much more precision to CLV calculations by thoroughly considering retention and/or churn
rate heterogeneity, but measures them as actuarial variables and not as response to marketing
effort. We derive formulae to compute expected residual transactions conditional on customer
purchase profile for deterministic models in non-contractual contexts and suggest solutions to
evaluate forecasting error as compared to stochastic models.
Key words: customer equity, CRM, modelling
1. Introduction
The diffusion of interactive marketing and of database technologies has increased the availability of
customer transaction data. In most contexts this has led to the adoption of customer and
customer portfolio evaluation methods that historically emerged in the direct marketing
and catalogue sales industries. Customer Lifetime Value (CLV) is the discounted value of the cash flows
generated by a customer or a customer cohort during their "life" with a firm. The term Net Present Value
(NPV) has also been used in order to indicate that customer value is a financial measure
evaluating an investment (acquisition costs) that produces future cash flows. These cash flows
decline with time due to customers' churn. At the level of an individual customer or of a
cohort of customers CLV is usually calculated as a sum of discounted gains or cash flows
ignoring the initial investment or acquisition costs. Blattberg & Deighton (1996) coined the
term Customer Equity (CE) to designate the CLV from which acquisition costs are subtracted.
The same term has been used more recently in order to designate the net present value of the
customer base. As a company's customer base is formed dynamically by a sequence of
customer cohorts, Villanueva and Hanssens (2007) suggest, in order to avoid confusion, using
the term Static Customer Equity (SCE) at the individual customer or cohort level and the term
Dynamic Customer Equity (DCE) at the customer base level. According to Gupta et
al. (2004), this last measure is under certain circumstances a good proxy for the value of a company, as it
accounts for both current and future relationships. In order to offer better definitions of the
CLV and CE concepts at the individual customer (or cohort) level, Pfeifer et al. (2005) introduce
the distinction between the value of a just-acquired customer and the value of a new or
existing one. If CLV denotes the value of a just-acquired customer (see table 1), that is, a
customer value calculation starting just after the acquisition period, the value of a new or
existing customer also includes the margin (m) from the previous period (which for a new
customer is the acquisition period). CE subtracts acquisition costs from the value of a new
customer; these costs can be expressed as the ratio of the acquisition spending per prospect (A) to the
acquisition rate (a) achieved through this spending.
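The three measures distinguished by Pfeifer et al. (2005) can be sketched in a few lines of Python. This is only an illustration: the function name and the numerical values are hypothetical, and the CLV of a just-acquired customer is taken as given rather than computed.

```python
def customer_value_measures(clv, m, A, a):
    """Return the three customer value measures for one customer.

    clv: value of a just-acquired customer (taken as given here)
    m:   margin from the previous (acquisition) period
    A:   acquisition spending per prospect
    a:   acquisition rate, 0 < a <= 1
    """
    just_acquired = clv
    new_or_existing = m + clv
    # A / a is the acquisition cost per customer actually acquired.
    customer_equity = -A / a + m + clv
    return just_acquired, new_or_existing, customer_equity


# Hypothetical figures, chosen only for illustration.
just_acquired, new_or_existing, ce = customer_value_measures(
    clv=200.0, m=50.0, A=10.0, a=0.1
)
print(just_acquired, new_or_existing, round(ce, 2))
```

With these assumed figures, spending 10 per prospect at a 10% acquisition rate costs 100 per acquired customer, which is what CE subtracts from the value of a new customer.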
Measure of customer lifetime value       Calculation
Value of a just-acquired customer        CLV
Value of a new or existing customer      m + CLV
Customer equity (CE)                     -A/a + m + CLV
Table 1 - Customer value at various client/prospect states (adapted from Pfeifer et al., 2005)
Before presenting CLV calculations and modelling approaches, it is necessary to distinguish
between two types of temporal (dynamic) customer behaviour in the relationship with a
company. Different authors have named them differently: "lost for good" / "always
a share" (Jackson, 1985), "retention model" / "migration model" (Dwyer, 1989) or
"contractual" / "non-contractual" (Reinartz and Kumar, 2000). The retention model considers
that a person or a company remains a customer as long as they generate transactions. The
migration model considers that customers can reappear after some periods
during which they made no transactions, and tracks their probability of "reactivating".
Our study is organised as follows. After a presentation of wider CLV modelling approaches,
two main categories of CLV calculation methods are analysed: the deterministic and the
stochastic approach. In the final part we develop formulae and apply them to investigate the
impact of ignoring heterogeneity in non-contractual contexts using two datasets.
2. A review of CLV modelling approaches
Jain and Singh (2002), in one of the first and most frequently cited reviews of the CLV
literature, identify three main research directions. The first is the development of CLV
calculation models that focus on the revenue stream from customers, and on acquisition,
retention and other marketing costs in order to facilitate calculations, resource allocation and
optimisation of CLV. The second research stream, described as customer base analysis,
concentrates on methods that analytically predict the probabilistic value of customer
transactions from the existing customer base. The last direction uses analytical models in a
normative way and analyses the implications of CLV for managerial decisions. As the
literature on the subject has meanwhile grown considerably and approaches have become more diverse,
other classification criteria and perspectives have been added. Gupta et al. (2006), for example,
suggest a CLV analysis framework that takes as its starting point the effects of customer acquisition,
retention and expansion (cross- and up-selling) on customer value. Subsequently
interactions between these effects are analysed first at the individual and/or cohort level and
then at the firm level in order to determine the customer base value of the company. By
following this framework to structure modelling approaches, the authors also suggest
a classification that distinguishes between classical RFM (Recency, Frequency and Monetary)
methods and stochastic (probability), econometric, persistence, computer science and growth
or diffusion models. Developing the same framework, in which customer acquisition,
retention and expansion through cross-selling and up-selling are seen as determinants of
Customer Equity, Villanueva and Hanssens (2007) classify CE models according to the kind
of data sources they deal with: internal company databases, survey data,
company reports, panel data or even data on managerial judgments obtained by decision
calculus. The subarea of CLV modelling that we are interested in may be called
CLV calculation models, in the spirit of Jain and Singh (2002). As CLV
calculation models are very diverse, we deal with them progressively.
We start with a simple generic model and increase complexity gradually by adding
dynamic customer behaviour contexts (contractual, non-contractual), the timing of transaction
occasions (discrete and continuous time), customer heterogeneity (deterministic and
stochastic models) as well as various analysis levels (individual, cohort or customer base).
Crossing two of these dimensions, the type of customer relationship and the timing of
transaction occasions, identifies several modelling contexts, as shown in table 2.
Type of customer relationship   Transaction occasions: discrete                            Transaction occasions: continuous
Contractual                     Magazine subscriptions, Insurance policies, etc.           Credit cards, Mobile phone usage
Non-contractual                 Catalogue sales, Events attendance, Charity fund drives    Retail purchases, Doctor visits, Hotel stays
Source: adapted from Fader & Hardie (2008) p.10
Table 2 - Match between activities and/or industries and the typology of customer
relationships, examples.
The basic structural model (Jain and Singh, 2002) outlines the financial dimension of CLV:
$$CLV = \sum_{t=1}^{T} \frac{M_t - C_t}{(1+d)^{t-0.5}} \qquad (1)$$
where t = the customer transaction period; M_t = the margin obtained and C_t = the cost incurred for
the customer in period t; d = the discount rate; T = the total number of periods (years) in the
expected lifetime of a customer.
The gain in each period is multiplied by the discount factor 1/(1+d) raised to a given power (here
t - 0.5, which approximates the mean timing of revenue and spending flows within the period). This model is
general, since the periodic margin and cost encapsulate the various transactional
dynamics that generate those financial flows. These flows can result from retention or
survival rates (probabilities) when the context is contractual, but also from
customer reactivation when the customer relationship is non-contractual. According to Gupta
et al. (2006), CLV differs from the purely financial net present
value measure in two fundamental respects: first, it is estimated at the individual or segment level;
second, it explicitly integrates possible future customer attrition.
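As a minimal sketch of formula (1), the following Python function sums the mid-period discounted gains over a finite horizon. The function name and the sample margins, costs and discount rate are hypothetical and serve only to show the mechanics.

```python
def clv_structural(margins, costs, d):
    """Formula (1): sum over t = 1..T of (M_t - C_t) / (1 + d)^(t - 0.5).

    margins, costs: sequences of per-period margins M_t and costs C_t
    d: discount rate per period
    """
    return sum(
        (m_t - c_t) / (1.0 + d) ** (t - 0.5)
        for t, (m_t, c_t) in enumerate(zip(margins, costs), start=1)
    )


# Hypothetical three-year customer with declining margins.
example_clv = clv_structural(margins=[100.0, 80.0, 60.0],
                             costs=[20.0, 10.0, 10.0],
                             d=0.10)
print(round(example_clv, 2))
```

Attrition does not appear explicitly in this form: it is assumed to be already reflected in the declining margins passed to the function.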
3. Deterministic and stochastic CLV models
The focus of the study is on deterministic and stochastic models belonging to the so-called
"buy till you die" framework (Schmittlein et al., 1987; Fader & Hardie, 2008), which is
probably one of the most important research streams in CLV modelling nowadays. Both
assume individually constant buy and die probabilities for customers. While deterministic
models are suited for individual CLV calculations, stochastic models should be used when
computing CLV at an aggregated level (customer cohorts or customer base) as they integrate
heterogeneity of buy and/or die probabilities among individuals.
Deterministic retention models
Simple deterministic retention models are a good basis for introducing some fundamental
aspects of CLV calculations. If the temporal shift between gains and
spending is ignored and the two are combined into a net gain (g), the difference
between margin (m) and cost (c), which remains constant for a customer over time, the CLV
formula becomes very concise. For a just-acquired customer it is:
$$CLV = \sum_{t=1}^{\infty} \frac{g\,r^{t}}{(1+d)^{t}} = g\,\frac{r}{1+d-r} \qquad (2)$$
where r is the retention rate and r/(1 + d − r) is called the "margin multiple" by Gupta and
Lehmann (2003). Gupta and Lehmann (2005) have shown that if gains grow at a constant rate
q, the margin multiple becomes r/[1 + d − r(1 + q)]. By separating monetary flows from
transaction flows, CLV can be expressed as the product of the gain (here constant) and what
Calciu and Salerno (2002) call the discounted expected number of transactions or, more
simply, discounted expected transactions (DET). In other words, CLV = g × DET. DET can be
seen as the CLV of transactions that each produce a gain of one monetary unit (g = 1), and it can be used
as a proxy for CLV. The DET of a just-acquired customer is the margin multiple r/(1 + d − r), as
can be seen from formula (2). The DET of a new or existing customer is computed by
adding an initial transaction (see table 3). In order to predict the future purchasing patterns of
the customers listed in the firm's transaction database, and to estimate their
expected lifetime value, Fader, Hardie and Berger (2004) introduce the concept of "as-yet-to-be-acquired"
customers. A yet-to-be-acquired customer is seen as an existing customer who is still
to be retained (or acquired), which means that the DET of such a customer is the DET
for an existing customer multiplied by the retention rate (see table 3).
Measure of discounted expected transactions (DET)   Calculation
DET of a just-acquired customer                     DET(just acquired) = r/(1+d-r)
DET of a new or existing customer                   DET(new or existing) = 1 + DET(just acquired) = (1+d)/(1+d-r)
DET of a yet-to-be-acquired customer (DERT)         DET(yet-to-be-acquired) = r DET(new or existing) = r(1+d)/(1+d-r)
Table 3 - Discounted Expected Transactions calculations for retention models
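The closed forms of table 3 can be checked numerically against the infinite series of formula (2) with g = 1. The sketch below is illustrative only: the function names and the values r = 0.8 and d = 0.1 are assumptions, and the series is truncated at a horizon long enough for the remainder to be negligible.

```python
def det_just_acquired(r, d):
    """Margin multiple r / (1 + d - r)."""
    return r / (1.0 + d - r)


def det_new_or_existing(r, d):
    """Adds the initial transaction: (1 + d) / (1 + d - r)."""
    return (1.0 + d) / (1.0 + d - r)


def det_yet_to_be_acquired(r, d):
    """DERT: existing-customer DET multiplied by the retention rate."""
    return r * det_new_or_existing(r, d)


def det_series(r, d, horizon=2000):
    """Truncated version of the sum in formula (2) with g = 1."""
    return sum((r / (1.0 + d)) ** t for t in range(1, horizon + 1))


r, d = 0.8, 0.1  # hypothetical retention and discount rates
print(round(det_just_acquired(r, d), 4), round(det_series(r, d), 4))
print(round(det_new_or_existing(r, d), 4), round(det_yet_to_be_acquired(r, d), 4))
```

Multiplying any of these DET values by a constant gain g gives the corresponding CLV.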
The DET of a "yet-to-be-acquired" customer at a given moment n, just before the decision
whether to repurchase (or renew the contract), represents the discounted expected
residual transactions (DERT) from that moment on. It makes it possible to compute the Discounted
Expected Residual Lifetime value (DERL) (Fader & Hardie, 2007b) of existing customers, for
which it is a proxy. DERT is well suited to situations where retention rates, while
constant individually, are heterogeneous within a cohort, since it can be shown that in such
situations the aggregate retention rate increases over time. Deterministic retention models have
been extended by Gupta et al. (2004) in order to measure the value of a company's customer
base, or Dynamic Customer Equity (DCE), which is defined as the CLV of current
and future customers. DCE, calculated as the sum over all cohorts with new
customer acquisitions accounted for on an annual basis, can be obtained using the following
discrete-time formula (see footnote 1):
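$$DCE = \sum_{k=0}^{\infty} \frac{n_k}{(1+d)^{k}} \left[ \sum_{t=k}^{\infty} \frac{m_{t-k}\, r^{t-k}}{(1+d)^{t-k}} - c_k \right] \qquad (3)$$
where n_k is the number of customers in cohort k and c_k is the acquisition cost per customer in
cohort k. The initial number of customers in each cohort is determined using a diffusion growth model.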
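A rough numerical reading of formula (3) is sketched below. It rests on several simplifying assumptions that are not in the source: the margin and the acquisition cost are constant across periods and cohorts, cohort sizes are supplied as a finite list instead of coming from a diffusion model, and the inner infinite sum is truncated. All names and values are hypothetical.

```python
def dce_discrete(cohort_sizes, margin, r, d, acq_cost, horizon=500):
    """Truncated version of formula (3) with constant margin and acquisition cost.

    cohort_sizes: list of n_k for k = 0, 1, 2, ... (assumed given)
    margin:       constant per-period margin m
    r, d:         retention rate and discount rate
    acq_cost:     constant acquisition cost per customer c
    """
    # Inner sum of formula (3), written with s = t - k; identical for every cohort here.
    value_per_customer = sum(
        margin * r ** s / (1.0 + d) ** s for s in range(horizon)
    )
    # Outer sum: each cohort is discounted back from its acquisition period k.
    return sum(
        n_k / (1.0 + d) ** k * (value_per_customer - acq_cost)
        for k, n_k in enumerate(cohort_sizes)
    )


# Hypothetical example: three yearly cohorts.
dce = dce_discrete([1000, 1200, 1400], margin=50.0, r=0.8, d=0.1, acq_cost=60.0)
print(round(dce, 2))
```

With a constant margin the inner sum equals m(1 + d)/(1 + d − r), which is m times the DET of a new or existing customer.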
While producing very useful managerial applications, all deterministic models such as the one
presented in this section commit, when computing CLV at the aggregate cohort or multi-cohort level,
the error of ignoring the heterogeneity of individual customer response probabilities. This
aspect is discussed later in this paper.
1 The calculation for continuous time is:
$$DCE = \int_{k=0}^{\infty} \int_{t=k}^{\infty} n_k\, e^{-dk}\, m_{t-k}\, e^{-\frac{1+d-r}{r}(t-k)}\, dt\, dk - \int_{k=0}^{\infty} n_k\, e^{-dk}\, c_k\, dk$$
There is a fundamental marketing belief, that companies can influence customer response,
meaning that they can “control” customer acquisition and retention rates by varying
marketing efforts. This implies that it is also possible to optimally allocate marketing retention
and acquisition efforts. Formula 2, with the retention rate expressed as a function of the retention
budget (r = f(R)), becomes:
$$CLV = \left(m - \frac{R}{r}\right) \frac{r}{1+d-r} = \left(m - \frac{R}{f(R)}\right) \frac{f(R)}{1+d-f(R)} \qquad (4)$$
The gain (g) is the margin minus the average retention cost, which is expressed here
as the ratio of the spending per targeted customer (R) to the retention rate achieved (r). This
formula helps find the optimal retention spending and is part of the elegant Blattberg and
Deighton (1996) model, which also deals with optimal customer acquisition spending using
formula 5.
CE = -A/a + m + CLV (5)
This model has recently been modified by Pfeifer (2005).
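To illustrate how formula (4) can be used to search for an optimal retention budget, the sketch below assumes a concave response function with a ceiling, f(R) = ceiling × (1 − exp(−kR)). This functional form, its parameters, the grid search and all numerical values are assumptions made for the example. Formula (4) is evaluated in the algebraically equivalent form (m·r − R)/(1 + d − r), which avoids dividing by r when R = 0.

```python
import math


def retention_rate(R, ceiling=0.9, k=0.1):
    """Assumed response function r = f(R): concave, saturating at `ceiling`."""
    return ceiling * (1.0 - math.exp(-k * R))


def clv_given_spend(R, m, d, ceiling=0.9, k=0.1):
    """Formula (4) multiplied out: (m - R/r) * r / (1 + d - r) = (m*r - R) / (1 + d - r)."""
    r = retention_rate(R, ceiling, k)
    return (m * r - R) / (1.0 + d - r)


def best_retention_spend(m, d, max_spend=100.0, n_steps=1000):
    """Grid search over retention budgets between 0 and max_spend."""
    candidates = [max_spend * i / n_steps for i in range(n_steps + 1)]
    return max(candidates, key=lambda R: clv_given_spend(R, m, d))


# Hypothetical margin and discount rate.
best_R = best_retention_spend(m=100.0, d=0.1)
print(best_R, round(clv_given_spend(best_R, 100.0, 0.1), 2))
```

A grid search is used only for transparency; any one-dimensional optimiser would serve equally well on a smooth response function.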
Deterministic migration models
The “always a share” behaviour is the alternative to “lost for good” in this well-known
dichotomy. Here customer value comes not only from surviving customers but also from
customers who are allowed to reactivate after a given number of inactive periods. Customers are
considered “lost for good” only after exceeding that number of successive periods of
inactivity. When the tolerated number of successive periods of inactivity is reduced to zero, the
“always a share” model reduces to the “lost for good” model, which can thus be seen as a special case of
this more general model. In non-contractual settings firms cannot know when a customer
becomes inactive. Intuitively, they can apply rules (conventions) based upon the recency,
frequency and monetary amount (RFM) of past purchases in order to decide whether a
customer is still active or not. By defining recency, frequency and monetary states based on past
behaviour, transition probabilities from one state to another can be computed and organised
into a transition probability matrix that defines a Markov chain. If we use only
purchase recency, the migration scheme that governs the construction of the transition
probability matrix is shown in figure 1. It indicates that a customer in recency state t can either
buy and move to recency state 1, or not buy and move to the less recent state t+1.
R(1) <------R(t)-------->R(t+1)
Figure 1 - Migration scheme between Recency States
A detailed discussion of the matrix approach applied to customer migration can be found in
Pfeifer and Carraway (2000). One can also find there a detailed migration scheme between
R,F,M states and a graphical representation of transitions in and out of a given RFM state. A
practical application is given later in this paper. The Markov chain matrix approach to
computing customer value has been used by Bitran and Mondschein (1996) and
Pfeifer and Carraway (2000), who use RFM variables to define transition states. Rust,
Lemon, and Zeithaml (2004) have defined transition probability matrices between brands that
vary over time following a logit model. Using the analogy between the retention probability
(r) and the transition probability matrix (P), it is easy to derive equivalent matrix formulas for
DET, such as the ones in table 4:
Measure of discounted expected transactions (DET)   Calculation
DET of a just-acquired customer                     DET(just acquired) = [I - P/(1+d)]^{-1} - I
DET of a new or existing customer                   DET(new or existing) = I + DET(just acquired) = [I - P/(1+d)]^{-1}
DET of a yet-to-be-acquired customer (DERT)         DET(yet-to-be-acquired) = P DET(new or existing) = P [I - P/(1+d)]^{-1}
Table 4 - Discounted Expected Transactions calculations for migration models
While the formula for new and existing customers has been suggested by Pfeifer & Carraway
(2000), the other formulae are a contribution of this research. They will help compare
deterministic and stochastic approaches to CLV calculations, later in this paper.
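The matrix formulas of table 4 can be sketched in pure Python for a recency-only chain of the kind shown in figure 1. Several choices here are assumptions made for illustration: the purchase probabilities are hypothetical, the last state is treated as absorbing ("lost for good"), the inverse is approximated by summing the series of powers of P/(1+d) instead of inverting directly, and a transaction is counted each time the chain visits recency state 1, so the DET per starting state is read from the first column.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def recency_matrix(buy_probs):
    """Recency-only chain: from state t, buy -> state 1, otherwise -> state t+1.

    One extra absorbing state is appended for customers considered lost.
    """
    n = len(buy_probs) + 1
    P = [[0.0] * n for _ in range(n)]
    for t, p in enumerate(buy_probs):
        P[t][0] += p
        P[t][t + 1] += 1.0 - p
    P[n - 1][n - 1] = 1.0
    return P


def discounted_inverse(P, d, tol=1e-12, max_terms=100000):
    """Approximate [I - P/(1+d)]^{-1} by the series sum over t of (P/(1+d))^t (needs d > 0)."""
    n = len(P)
    Q = [[p / (1.0 + d) for p in row] for row in P]
    total, term = identity(n), identity(n)
    for _ in range(max_terms):
        term = mat_mul(term, Q)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
        if max(abs(v) for row in term for v in row) < tol:
            break
    return total


# Hypothetical purchase probabilities for recency states 1, 2 and 3.
P = recency_matrix([0.3, 0.2, 0.1])
N = discounted_inverse(P, d=0.1)
det_new_or_existing = [row[0] for row in N]
det_just_acquired = [N[i][0] - (1.0 if i == 0 else 0.0) for i in range(len(N))]
det_yet_to_be_acquired = [row[0] for row in mat_mul(P, N)]
print([round(v, 3) for v in det_yet_to_be_acquired])
```

With a single recency state and purchase probability r, the chain reduces to the retention model and the first entries reproduce the scalar formulas of table 3.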
4. Stochastic CLV Models
Probability or stochastic models consider observed behaviour as the result of an underlying
stochastic process controlled by unobserved latent characteristics that vary across individuals.
According to Gupta et al. (2006), they seek a simple paramorphic representation that describes
and predicts observed behaviour, instead of trying to explain differences in
observed behaviour as a function of covariates (as in regression models). They assume
that a given behaviour varies across the population according to a probability distribution. Their main
advantage is that they take into account and measure heterogeneity in observed behaviour.
Their main disadvantage is that, by measuring only the characteristics of those probability distributions, they
assume that these distributions alone control observed behaviour and leave no room for
marketing variables, as if dynamic customer behaviour could not be changed by marketing
activities (see footnote 2). One of the most respected stochastic CLV models and a forerunner of recent
developments is the Pareto/NBD model that has been developed by Schmittlein, Morrison and
Colombo (1987). It is the benchmark model for non-contractual settings where transactions
can occur at any moment over time. It is not well suited for purchase situations that occur at
fixed time intervals and even less suited for contractual settings.
Type of customer relationship / Transaction occasions   Models
Contractual / Discrete                                   Shifted Betageometric (sBGD - Fader & Hardie, 2006, 2007)
                                                         Beta-discrete-Weibull (BdWD - Fader
Contractual / Continuous                                 Exponential Gamma (EGD or Pareto distribution of the second kind)
Non-contractual / Discrete                               Betageometric Betabinomial (BG/BBD - Fader, Hardie & Berger, 2004)
Non-contractual / Continuous                             Pareto/NBD (Schmittlein, Morrison & Colombo, 1987)
                                                         Betageometric/NBD (Fader, Hardie, 2005)
                                                         Modified Betageometric/NBD (Batislam et al., 2007)
Table 5 - Stochastic Models covering various dynamic customer behaviour contexts
2 There are several studies that try to extend these models in order to integrate marketing covariates. Schweidel et al. (2008) use a mixed effects hazard model with both fixed and random components. Strictly speaking such models exceed the definition that is commonly given to stochastic models and should be classified as econometric models.
The series of relatively recent developments by authors such as Fader and Hardie, which
began with the BG/NBD model (Fader, Hardie & Lee, 2005), a modification of the Pareto/NBD
that substantially simplifies parameter estimation, has led these authors to
progressively cover all the dynamic customer behaviour contexts presented in table 2 with
stochastic models matching each context and its corresponding industries. The names of
these models, their authors and the year of publication are given in table 5. Taking only the
prototype models (marked in bold) for each cell of table 5, we can
summarise that the mixing distributions accounting for
heterogeneity in buy and/or die probabilities are beta distributions for discrete-time models and
gamma distributions for continuous-time models. In the simpler contractual case, the
individual buying process, representing the number of purchases during a given period for a
given buying probability (see footnote 3), follows a shifted geometric distribution in discrete time
and an exponential distribution in continuous time. In non-contractual settings the buy
and die probabilities are distinct, and the probability distributions that represent the buying and
dying processes need to be integrated as subprocesses in order to represent the combined
behaviour. In the discrete case the individual buying process follows a
Bernoulli distribution and the dying process a geometric distribution, while in the continuous
case the buying process follows a Poisson distribution and the dying process an exponential distribution. For
most of these models, closed-form expressions can be derived for the CLV as well as for
related proxy measures such as DET and DERT. For example, the DERT for the stochastic
migration model in discrete time, the beta-geometric/beta-binomial (BG/BB)
model, which will be used later in this paper in a comparison with the deterministic migration
model, is given by the following formula:
$$DERT = \frac{B(\alpha+x+1,\ \beta+n-x)}{B(\alpha,\beta)}\ \frac{B(\gamma,\ \delta+n+1)}{B(\gamma,\delta)}\ {}_2F_1\!\left(1,\ \delta+n+1;\ \gamma+\delta+n+1;\ \frac{1}{1+d}\right) \Big/ L(\alpha,\beta,\gamma,\delta \mid x,n,m) \qquad (6)$$
where (x, n, m) represents the purchase history, with x = the number of purchases, m = recency and n = the
current period; (α, β, γ, δ) are the estimated beta distribution parameters defining the
heterogeneous buying and dying probabilities; B(.) is the Beta function, 2F1(.) is the
Gaussian hypergeometric function and L(.) is the likelihood. Research on stochastic models
for customer database analysis is very active. Existing models are being modified or extended: for
example, Fader and Hardie (2007b) allow for the incorporation of time-invariant covariate
effects into the Pareto/NBD and BG/NBD models. The BG/NBD model, developed in 2005 as an
alternative to the Pareto/NBD model, has itself been modified and extended by several
researchers (Batislam et al. 2007, 2008; Fader and Hardie 2007b; Hoppe and Wagner 2007;
Wagner and Hoppe 2008). New ways of estimating parameters and of integrating
stochastic models are being explored, and several researchers use MCMC methods for this
purpose. For example, Ma and Liu (2007) and Abe (2008) use the Gibbs sampler for parameter
estimation, while Singh, Borle and Jain (2007) try to build a generalized framework for
estimating CLV when lifetimes are not observed, using the Metropolis-Hastings algorithm.
3 Which in contractual settings is equivalent to the (1 - die) probability.
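The discrete-time non-contractual process described in this section (Bernoulli buying, geometric dying, beta-distributed heterogeneity in both probabilities) can be simulated with the standard library alone. The sketch below is not the authors' estimation procedure. It assumes that in each period a living customer first "dies" with probability theta and otherwise buys with probability p; this ordering is an assumption of the example. The parameter values are the Cruiseship estimates reported in section 5 of this paper.

```python
import random


def simulate_bgbb_customer(n_periods, alpha, beta, gamma, delta, rng):
    """Simulate one customer's purchase history as a list of 0/1 flags.

    p     ~ Beta(alpha, beta):  individual buying probability (constant over time)
    theta ~ Beta(gamma, delta): individual dying probability (constant over time)
    """
    p = rng.betavariate(alpha, beta)
    theta = rng.betavariate(gamma, delta)
    alive = True
    history = []
    for _ in range(n_periods):
        # Assumed ordering: the death draw comes before the purchase draw.
        if alive and rng.random() < theta:
            alive = False
        history.append(1 if alive and rng.random() < p else 0)
    return history


rng = random.Random(2009)  # fixed seed so the run is reproducible
cohort = [
    simulate_bgbb_customer(5, 0.657, 5.193, 173.761, 1882.928, rng)
    for _ in range(1000)
]
mean_purchases = sum(sum(h) for h in cohort) / len(cohort)
print(round(mean_purchases, 3))
```

Because p and theta are drawn once per customer, each simulated individual has constant probabilities while the cohort as a whole is heterogeneous, which is the situation the stochastic models are designed to capture.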
5. Investigating the impact of heterogeneity on CLV
In their as yet unpublished paper "Customer-Base Valuation in a Contractual Setting:
The Perils of Ignoring Heterogeneity", Fader and Hardie (2006) show that models
assuming constant retention at the cohort or firm level are wrong, since retention rates that are
constant individually typically increase over time at the aggregate cohort level. They warn managers
and researchers not to use such models and advise academics not to teach CLV formulae
based on them. This is a rather strong point of view that seems to wipe out a
substantial area of research that was dominant in the early stages of the CLV
literature. We agree with all the arguments showing that ignoring heterogeneity in customer
response rates produces erroneous results, and we show how to extend the exploration of the precision
gap between those criticised models and correct stochastic models to a non-contractual
setting. But in evaluating this precision gap we do not go so far as to ban homogeneous
response models and their calculation formulae; rather, we seek in this way a means of keeping them, as we
think that they are more flexible, easier to use and understand, and can integrate the marketing
mix. In the above-mentioned paper, Fader and Hardie (2006) use the two datasets from a
previous version of their paper introducing the shifted beta-geometric distribution model
(Fader & Hardie, 2007) in order to compare the latter with the fixed retention rate CLV
calculation given in formula 4. In order to extend the investigation of the impact of
heterogeneity to non-contractual situations, we use the formula derived in table 4 for
computing Discounted Expected Residual Transactions (DERT) for a yet-to-be-acquired
customer in order to produce lifetime value calculations for a dynamic customer behaviour
in which buying and "death" probabilities are individually constant over time and "wrongly"
considered homogeneous within a cohort. Additional matrix formulations for expected
residual transactions at various discrete time periods are used in order to observe the
increasing precision gap between this model, which ignores heterogeneity, and the corresponding
stochastic model for non-contractual settings, the beta-geometric/beta-binomial (BG/BB)
model (Fader, Hardie & Berger, 2004). In our investigation we also use two datasets. The first
is the Cruiseship dataset (Berger, Weinberg & Hanna, 2003; Fader, Hardie & Berger, 2004),
consisting of cruise-line transactions for a cohort of 6094 customers over a period of five
years. The second is a catalogue sales dataset that for confidentiality reasons we will call
"Brabant". It consists of catalogue orders for a cohort of 36882 customers over seven seasons.
Unlike the first dataset, the second shows strong seasonal effects in buying behaviour,
with significantly more customers buying in autumn and fewer customers buying in
spring. As the BG/BB model is not yet able to deal with such effects (see footnote 4), this opportunity is
used to point out some limitations that these stochastic models can have in dealing
with real-world situations.
4 Extending the BG/BB model in order to integrate seasonality effects is a future research direction that we have started to explore.
Figure 2 - Building the transition matrix from observed buying behaviour: (a) observed buying behaviour; (b) RF transition matrix (P). Source: Fader, Hardie & Berger, 2004, p.10
The main contribution of this exercise is to adapt the homogeneous and constant transition
probabilities matrix approach to make it comparable to the corresponding stochastic approach.
We use the decision tree recording whether customers took cruises each year (Fig. 2a) to
compute the transition probability matrix (Fig. 2b). Applying the expected residual
transactions calculations for both the deterministic migration model and the stochastic model
yields table 6 and figure 3. Table 6 shows that the model ignoring
heterogeneity undervalues the long-term expected transactions and the discounted expected
transactions, and therefore the CLV, for which DET is a proxy.
States   Transactions aft. 10 pds.    Expected Transactions        DET
         determinist  stochastic      determinist  stochastic      determinist  stochastic
r1f3     0.53         2.05            0.53         3.57            0.51         1.77
r1f2     0.69         1.52            0.69         2.65            0.62         1.31
r1f1     0.44         0.97            0.44         1.68            0.38         0.83
r2f2     0.55         1.28            0.55         2.23            0.53         1.10
r2f1     0.25         0.85            0.25         1.47            0.22         0.73
r3f1     0.10         0.72            0.10         1.25            0.09         0.62
r4       0.00         0.28            0.00         0.48            0            0.24
Table 6 - Expected transactions calculations for non-contractual settings
Figure 3 - Evolution of residual expected transactions using determinist and stochastic models: (a) determinist migration model; (b) BG/BB model
Figure 3 also shows that this undervaluation of the expected transactions is increasing with
the number of transaction periods. The same result is obtained for the catalogue sales dataset.
Models ignoring heterogeneity fail to recognise the sorting effect taking place in a
heterogeneous cohort that makes individuals with low buying probability and higher
probability to « die » leave the cohort earlier, so that over time the cohort progressively
retains customers with higher repeat buy probabilities and lower probabilities to die.
Therefore those models produce a downward-biased estimation of expected transactions and
lifetime value.
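The sorting effect can be made concrete with a deliberately simplified retention-only illustration, which is not the BG/BB model used in this paper. It assumes a cohort made of two hypothetical segments with constant individual retention rates of 0.9 and 0.5. The aggregate retention rate rises over time as the low-retention segment leaves, and a homogeneous model calibrated on the first-period aggregate rate undervalues the expected number of future periods.

```python
def aggregate_retention(segments, horizon):
    """Aggregate retention rates r_1..r_horizon for a mixture of segments.

    segments: list of (share, individual_retention_rate) pairs, shares summing to 1
    """
    survival = [sum(w * r ** t for w, r in segments) for t in range(horizon + 1)]
    return [survival[t] / survival[t - 1] for t in range(1, horizon + 1)]


def expected_periods(segments):
    """Undiscounted expected number of future periods survived: sum over t of r^t = r/(1-r)."""
    return sum(w * r / (1.0 - r) for w, r in segments)


segments = [(0.5, 0.9), (0.5, 0.5)]  # hypothetical two-segment cohort
rates = aggregate_retention(segments, horizon=5)

# Homogeneous model: every customer gets the first-period aggregate rate.
homogeneous = expected_periods([(1.0, rates[0])])
heterogeneous = expected_periods(segments)
print([round(r, 3) for r in rates])
print(round(homogeneous, 2), round(heterogeneous, 2))
```

In this assumed example the aggregate rate climbs from 0.70 towards 0.90, and the homogeneous calculation gives fewer than half of the expected periods obtained when the two segments are valued separately.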
The heterogeneity measures of stochastic models can also be used to evaluate the size of the
downward bias induced by deterministic models. In discrete time models the mixing
distribution used to account for heterogeneity is the beta distribution. Its two parameters α
and β can be characterized in terms of the mean µ = E(p) = α/(α + β) and the polarization index
φ = 1/(α + β + 1). The polarization index is a measure of heterogeneity. When α, β tend
towards zero the polarization index tends towards 1 and the beta distribution of the studied
probability p becomes U-shaped. Its values are concentrated near p = 0 and p = 1 meaning
high heterogeneity. Conversely when these parameters are big (α and β →∞) the polarisation
index tends towards zero and the beta distribution becomes a spike at its mean. In the case of
the Cruiseship dataset, beta distributions parameters for the purchase probability are α =
0.657, β = 5.193 indicating a mean buying probability of α/(α + β) =0.11 and a polarisation φ
= 1/(α + β + 1) = 0.146. The corresponding beta distribution parameters for the probability to
“die” γ = 173.761, δ = 1882.928 indicate that as regards the dying process this cohort is
highly homogeneous with a polarisation index near to zero and a spike at the mean probability
of 0.084. In the case of the catalogue sales dataset we have a higher mean buying probability
of 0.26 and a higher polarisation index 0.216 indicating more heterogeneity. The customer
death process is much less intense with a mean probability of 0.02 but with higher
polarization (0.18) than in the cruiseship case. This explains why long term expected
transactions and the DET measure for CLV are significantly higher for the catalogue sales
dataset. Further investigations of heterogeneity effects on long term expected transactions and
customer value can be made for each of the to dynamic processes that are combined in the
non-contractual settings, the buying process and the death process. A good illustration limited
only to the buying process is given in Fader and Hardie (2006).
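The mean and polarization index quoted above follow directly from the two beta parameters. As a minimal Python sketch, the functions below recompute them from the Cruiseship estimates given in the text. The function names and the inverse mapping `beta_parameters` are illustrative additions, not part of the original study.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


def polarization_index(alpha, beta):
    """Polarization index 1 / (alpha + beta + 1): close to 1 for a U-shaped
    (highly heterogeneous) distribution, close to 0 for a spike at the mean."""
    return 1.0 / (alpha + beta + 1.0)


def beta_parameters(mean, polarization):
    """Inverse mapping: recover (alpha, beta) from a mean and a polarization index."""
    total = 1.0 / polarization - 1.0  # equals alpha + beta
    return mean * total, (1.0 - mean) * total


# Parameter estimates reported in the text for the Cruiseship dataset.
CRUISESHIP = {
    "purchase": (0.657, 5.193),
    "death": (173.761, 1882.928),
}

for process, (a, b) in CRUISESHIP.items():
    print(
        f"{process}: mean={beta_mean(a, b):.3f}  "
        f"polarization={polarization_index(a, b):.4f}"
    )
```

Applied in the other direction, `beta_parameters(0.26, 0.216)` gives roughly α ≈ 0.94 and β ≈ 2.69 for the catalogue sales buying process. These values are only implied by the rounded mean and polarization quoted above and should not be read as the estimated parameters.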
6. Discussion and further research
The main purpose of this investigation of the impact of heterogeneity in non-contractual
settings was to show that this impact can be evaluated and to indicate how this could
be done. Obtaining a high degree of precision in such an exercise remains a future research
objective. While the parameter estimation methodology from purchase history data is well
defined for the stochastic migration model, this is not yet the case for the “deterministic”
Markov chain model. Of the few studies mentioned earlier that apply migration models based
on Markov chain transition matrices, none seems to have suggested a particular
methodology for inferring transition probabilities from real data.
A potential solution to the problem of truncated transition probabilities would be to infer from
available data some causal relationship between purchase probabilities, recency and
frequency.
In general, using deterministic CLV formulas on aggregated cohort data is error-prone and
should be avoided. Nevertheless, there can be situations in which a cohort's buying behaviour is
homogeneous. As segmentation, which seeks groups of customers with homogeneous behaviour,
is a fundamental marketing activity, there must be situations where cohorts are highly
homogeneous. Even in data that are not particularly segmented, such as the Cruiseship dataset,
the observed cohort was highly homogeneous as concerns the death rate, as indicated by the
spike in the estimated beta distribution. In such situations deterministic models could be used
without particular precautions. The use of stochastic models as precision benchmarks in CLV
calculation has been questioned in a recent study by Wübben & Wangenheim (2007). These
authors compared the predictive performance of stochastic non-contractual models
(Pareto/NBD and BG/NBD) with that of simple managerial heuristics on three tasks: (1)
distinguishing active customers from inactive customers, (2) generating transaction forecasts for
individual customers and determining future best customers, and (3) predicting the purchase
volume of the entire customer base. They found that the simple heuristics perform at least as well as
the stochastic models with regard to all managerially relevant areas, except for predictions
regarding future purchases at the overall customer base level. The heuristics used were: (1)
the simple recency hiatus to distinguish between active and inactive customers; (2) the past
10% best customers in a customer base will also be the future 10% best; (3) every customer
continues to buy at his or her past mean purchase frequency. Although the stochastic models
predicted purchase volume at the overall customer base level (which is equivalent to customer
equity) better than the simple heuristic that extrapolates individual past purchases into the
future, it is not certain that they would have outperformed well-tuned deterministic
models under the same conditions. This doubt is implicitly expressed by Wübben &
Wangenheim (2007) when they suggest that “it would be worthwhile to benchmark these
models against the approach that Gupta, Lehmann, and Stuart (2004) use”. These deterministic
models provide flexible and easy-to-use formulas that could be substituted for simple
heuristics⁵ and, in this guise, coexist with the more sophisticated stochastic models. All this
justifies efforts towards making stochastic and deterministic CLV calculation models
comparable.
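To make the three managerial heuristics discussed above concrete, they can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the customer identifiers, the purchase counts and the hiatus threshold are hypothetical, and the sketch is not the procedure used in the cited study.

```python
def is_active(periods_since_last_purchase, hiatus_threshold):
    """Heuristic 1 (recency hiatus): a customer counts as active as long as
    the time since the last purchase does not exceed the chosen threshold."""
    return periods_since_last_purchase <= hiatus_threshold


def future_best_customers(past_purchases, share=0.10):
    """Heuristic 2: the top `share` of customers by past purchases are
    assumed to remain the best customers in the future."""
    n_best = max(1, int(len(past_purchases) * share))
    ranked = sorted(past_purchases, key=past_purchases.get, reverse=True)
    return set(ranked[:n_best])


def forecast_purchases(n_past_purchases, observed_periods, future_periods):
    """Heuristic 3: the customer keeps buying at the past mean frequency."""
    return n_past_purchases / observed_periods * future_periods


# Hypothetical purchase counts over 12 observed periods.
past = {"c01": 12, "c02": 3, "c03": 0, "c04": 7, "c05": 1,
        "c06": 4, "c07": 2, "c08": 9, "c09": 5, "c10": 6}

print(is_active(periods_since_last_purchase=3, hiatus_threshold=6))
print(future_best_customers(past))
print(forecast_purchases(past["c04"], observed_periods=12, future_periods=6))
```

None of these rules uses information about the rest of the cohort beyond a ranking, which is what makes them easy to apply and also what separates them from both the deterministic migration model and the stochastic models.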
7. Conclusions
This study reviews CLV modelling approaches. It covers CLV computations at the customer,
cohort and company customer base levels. The managerial problems dealt with include
calculation aspects, CLV optimisation and the optimal balance between customer acquisition and
retention spending. During this review several concepts and formulae have been newly
organised and some new formulae have been derived. By introducing a clear distinction
between CE at the individual and cohort levels as opposed to the customer base level, we help
disambiguate this term. The concept of residual expected transactions (Fader & Hardie, 2007)
after a given time period and for the long term is formalised for deterministic non-contractual
settings, allowing for comparisons between corresponding deterministic and stochastic models.
The focus of the study is on deterministic and stochastic models belonging to the so-called
“buy till you die” framework (Schmittlein et al., 1987; Fader & Hardie, 2008), which is an
active research stream in CLV modelling nowadays. Both assume in most cases individually
constant buy and die probabilities for customers. By ignoring the heterogeneity in buy and die
probabilities that normally characterises cohorts of customers, the deterministic models are
less sophisticated and easier to develop and implement. Consequently they offer solutions to a
wider range of managerial problems. An overview of these solutions is given in the paper in
order to outline problems that have not yet been solved with stochastic models. Deterministic
models are also easier for managers to understand, as they are conceptually closer to simple
heuristics. While recognising the conceptual superiority and the quality of the measurement
attained through stochastic models, we cannot overlook some difficulties in their estimation and,
at this stage, their limitations as concerns the integration of marketing mix variables and
sometimes their lack of flexibility when adapting to real-world situations. While some recent studies try to
⁵ A more detailed discussion of managerial customer base analysis heuristics and of the reasons why managers tend to use them instead of adopting more sophisticated academic methods can be found in Wübben & Wangenheim (2007).
integrate stochastic models into more flexible frameworks, for example by using MCMC methods for
parameter estimation, these models still need heavy development in order to achieve the
flexibility of deterministic models. This concern motivated us to find ways to explore the
impact of heterogeneity in non-contractual settings. Using the Markov chain approach to
represent a deterministic customer migration model, we derive formulae for computing the
residual expected transactions after a given time period and for the long term, which in their
discounted form are proxies for residual lifetime value calculations. In this way, during each
(discrete-time) period of the lifetime of a cohort, the increasing precision gap between
the deterministic migration model that ignores heterogeneity within a cohort and the
corresponding BG/BB stochastic model can be evaluated. Illustrations have been given for
two datasets. The observed deviations are rather high. This justifies banning the use of
deterministic models in CLV calculations at aggregated levels like cohorts or company
customer bases in non-contractual settings, as was suggested by Fader & Hardie, who
investigated the impact of heterogeneity on CLV in contractual settings. Nevertheless, we also
see in the possibility of evaluating the precision gap a potential means of using deterministic
models as proxies for stochastic models in order to offer “satisficing” solutions to
managerial problems that cannot yet be efficiently solved with stochastic models.
References
Abe, Makoto (2008), “"Counting Your Customers" One by One: A Hierarchical Bayes
Extension to the Pareto/NBD Model,” Marketing Science.
<DOI:10.1287/mksc.1080.0383>
Bauer, Hans H., Maik Hammerschmidt and Matthias Braehler (2003), The Customer Lifetime
Value Concept and Its Contribution to Corporate Valuation, Yearbook of Marketing and
Consumer Research, Volume 1, GfK-Nürnberg, e.V., Berlin: Duncker & Humblot, 47–67.
Batislam, E. P., Denizel M. and Filiztekin A. (2007), Empirical Validation and Comparison of
Models for Customer Base Analysis, International Journal of Research in Marketing, 24
(September), 201–209.
Batislam, E. P., Denizel M. and Filiztekin A. (2008), Formal response to “Erratum on the
MBG/NBD Model”, International Journal of Research in Marketing, 25 (September), 227.
Berger P.D., Weinberg B. and Hanna R.C. (2003), Customer Lifetime Value Determination and
Strategic Implications for a Cruise-Ship Company, Journal of Database Marketing and
Customer Strategy Management, 11 (September), 40-52.
Blattberg, R.C. and J. Deighton, (1996) Manage marketing by the customer equity test,
Harvard Business Review, 74, 4 (July-August), 136-144.
Calciu, M. and Salerno F. (2002), “Customer Value Modeling: Synthesis and Extension
Proposals,” Journal of Targeting, Measurement and Analysis for Marketing, 11 (2), 124–47.
Dwyer F.R. (1989) Customer Lifetime Valuation to Support Marketing Decision Making,
Journal of Direct Marketing, 9, 1, 79-84.
Fader, P.S., Bruce G.S. Hardie and Paul D. Berger (2004), Customer-Base Analysis with
Discrete-Time Transaction Data, working paper, Marketing Department, The Wharton
School, University of Pennsylvania.
Fader P.S., Hardie B.G. S. and Lee K.L. (2005), “"Counting Your Customers" the Easy Way:
An Alternative to the Pareto/NBD Model,” Marketing Science, 24 (Spring), 275–284.
Fader, P.S., Hardie B.G. S. (2006), “Customer-Base Valuation in a Contractual Setting: The
Perils of Ignoring Heterogeneity.” <http://brucehardie.com/papers/022/>
Fader P.S., Hardie B.G.S. (2007a) How to project customer retention, Journal of Interactive
Marketing, 21, 1, 76-90.
Fader P.S, Hardie B.G.S. (2007b) Incorporating Time-Invariant Covariates into the
Pareto/NBD and BG/NBD Models, http://brucehardie.com/notes/019
Fader P.S., Hardie B.G.S (2008) Probability Models for Customer-Base Analysis, Journal of
Interactive Marketing, forthcoming, 21 pp.
Gupta, S., Lehmann D. R., Stuart J. A. (2004), ‘Valuing customers’. Journal of Marketing
Research, 41, 1, 7–18.
Gupta S., Lehmann D.R. (2003), Customers as Assets, Journal of Interactive Marketing, 17
(1), 9-24.
Gupta S., Lehmann D.R. (2005), Managing Customers as Investments. Philadelphia: Wharton
School Publishing.
Jackson, B. B. (1985) Winning and keeping industrial customers: The dynamics of customer
relationships, Lexington Books, Lexington, MA.
Jain D., Singh, S.S., (2002). Customer Lifetime Value Research in Marketing. A Review and
Future Directions. Journal of Interactive Marketing, 16(2), 34-46
Hoppe, D. and Wagner U. (2007), Customer Base Analysis: The Case for a Central Variant of
the Betageometric/NBD Model, Marketing - Journal of Research and Management, 3 (2),
75–90
Pfeifer P.K, Carraway (2000) Modelling Customer Relationships as Markov Chains, Journal
of Interactive Marketing, 14, 2, 43-55.
Pfeifer P.K, Haskins M.E., Conroy R.M (2005) Customer Lifetime Value, Customer
Profitability, and the Treatment of Acquisition Spending, Journal of Managerial Issues, 17,
1, 11-25.
Reinartz W. J and Kumar V. (2000) On the Profitability of Long-Life Customers in a
Noncontractual Setting: An Empirical Investigation and Implications for Marketing,
Journal of Marketing, 64, October, 17-35.
Reinartz, W. and V. Kumar (2003), ‘The impact of customer relationship characteristics on
profitable lifetime duration’. Journal of Marketing 67(1), 77–99.
Schweidel D.A., Fader P.S and Bradlow E.T (2008) Understanding Service Retention Within
and Across Cohorts Using Limited Information, Journal of Marketing, 72, January, 82-94.
Ma S-H; Liu J-L; (2007) The MCMC Approach for Solving the Pareto/NBD Model and
Possible Extensions, ICNC 2007. Third International Conference on Natural Computation,
Volume 2, 24-27 Aug., Page(s):505 - 512
Singh, S. S., Borle, S. and Jain, D. C., (2007) A Generalized Framework for Estimating
Customer Lifetime Value When Customer Lifetimes are Not Observed, (October 31).
Available at SSRN: http://ssrn.com/abstract=1154709
Villanueva J., Hanssens D. (2007) Customer Equity: Measurement, Management and
Research Opportunities, Foundations and Trends® in Marketing, 1, 1, 1–95,
http://dx.doi.org/10.1561/1700000002
Wagner U. and Hoppe D. (2008), Erratum on the MBG/NBD Model, International Journal of
Research in Marketing, 25 (September), 225–226.