Capturing Customer Heterogeneity using a Finite Mixture PLS Approach 
 
Carsten Hahn 
SAP AG, Neurottstraße, 69190 Walldorf, Germany
Michael D. Johnson 
University of Michigan Business School, 701 Tappan Street, Ann Arbor, Michigan 48109-1234, USA
Andreas Herrmann 
University of St. Gallen, MCM Institute, Blumenbergplatz 9, 9000 St. Gallen, Switzerland 
Frank Huber 
University of St. Gallen, MCM Institute, Blumenbergplatz 9, 9000 St. Gallen, Switzerland 
 
 
Abstract 
 
An approach for capturing unobserved customer heterogeneity in structural equation modeling is 
proposed based on partial least squares. The method uses a modified finite-mixture distribution 
approach. An empirical analysis using quality, customer satisfaction and loyalty data for convenience 
stores illustrates the advantages of the new method vis-a-vis a traditional market segmentation scheme 
based on well known grouping variables. The results confirm the assumption of heterogeneity in the 
individuals’ perception of the antecedents and consequences of satisfaction and their relationships. The 
results also illustrate how the finite-mixture approach complements and provides insights over and 
above a traditional segmentation scheme. 
 
1 Introduction 
 
Understanding customers requires an understanding of segment-level differences or heterogeneity. One 
traditional approach for understanding heterogeneity is to use separate marketing research (interviews, 
focus groups and surveys) to identify a priori segments upon which subsequent research and analysis is 
based. A more recent trend in marketing research is to determine the segments when analyzing 
customer data using a latent class or finite mixture approach. Our particular interest is to better 
understand heterogeneity within structural equation modeling (SEM) in marketing, and specifically 
models that link customer perceptions of quality and price to customer satisfaction and loyalty. For 
example, is customer satisfaction and loyalty for a convenience store driven primarily by “convenience” 
for some and “safety” for others? 
 
Two popular SEM methods for estimating such models are covariance structure analysis, or CSA (using
programs such as LISREL), and partial least squares, or PLS [4]. There are occasions and contexts where
researchers prefer to use PLS when estimating a quality, satisfaction and loyalty model. But while the
finite mixture approach has been added and applied to covariance structure analysis [5], it has not been
integrated with PLS.
 
The goal of this research is to integrate the advantages of PLS with the advantages of a finite mixture 
approach to market segmentation. The integration is unique because it leverages the advantages of a 
least-squares procedure when operationalizing a satisfaction model and the advantages of a maximum
likelihood-based approach when deriving market segments. We compare and contrast the approach with
a more traditional a priori segmentation scheme using data from a national convenience store survey. 
The new approach both complements the traditional segmentation scheme and provides unique 
insights into the drivers of customer satisfaction. We begin by describing satisfaction modeling and 
approaches to estimating models. We then describe our Finite Mixture PLS approach, our empirical 
study, and results. The advantages and disadvantages of the approach vis-a-vis traditional market 
segmentation are then described and discussed. 
 
2 Quality, satisfaction and loyalty modeling 
 
Models that link customer perceptions of quality and price to satisfaction and subsequent loyalty, or 
satisfaction models, have become common applications of SEM in marketing. Satisfaction models 
typically include the concrete attributes that describe a product or service, the benefits or consequences 
these attributes provide customers, a customer’s overall evaluation of their purchase and consumption 
experience (customer satisfaction), and the behavioral intentions or behaviors that result (such as 
repurchase, product recommendations or word-of-mouth, cross-selling, or price tolerance). These
models rest heavily on expectancy-value model formulations, where beliefs about the consumption 
experience (quality dimensions and price) affect customer satisfaction as a type of overall evaluation or 
attitude, which in turn affects customers’ behavioral intentions and behaviors. 
 
A key feature of satisfaction models is that the benefit, satisfaction, and loyalty constructs in the models 
are inherently abstract or latent variables. The most common way to empirically measure these latent 
variables is through the use of multiple concrete proxies or measurement variables. Benefits are 
measured using their attributes, satisfaction is measured using different overall evaluation standards 
(such as overall satisfaction, overall performance versus expectations, overall performance versus an
ideal), and loyalty is often measured using behavioral intentions (such as the likelihood of repurchase or 
recommendation to others). 
 
Statistical estimation of a satisfaction model must accommodate the fact that the model is a network of 
cause-and-effect relationships (as from quality, to satisfaction, to loyalty) that contains latent variables. 
There are two popular methods for estimating models of this type, partial least squares (PLS) and 
covariance structure analysis. The methods are more complementary than competing. Their use should 
depend on both the purpose of the analysis and the research context. For example, because the aim of 
covariance structure analysis is to explain relationships, and it is based on maximum likelihood 
estimation, it is particularly well suited to evaluating the relative fit of competing theoretical models. 
 
Yet there are frequent occasions in marketing research when PLS is the preferred method. PLS is 
essentially an iterative estimation procedure that integrates principal-components analysis with multiple 
regression. Whereas CSA explains covariance, the objective of PLS is to explain variance in the 
endogenous variables in a satisfaction model that have bottom-line managerial relevance (satisfaction, 
loyalty, profit). Because the latent variables in PLS are easily operationalized as principal components or 
weighted indices of the measurement variables, they provide managers with explicit benchmarks for 
evaluating their performance. When this performance information is combined with the impact scores 
from the regression estimates, managers have both the impact and performance information that they 
need to make key resource allocation decisions.

Bagozzi/Yi (1994) delineate three contextual factors that also influence the choice of method. They
argue that PLS is preferred over CSA when: (1) sample sizes are small, (2) the data to be analyzed is not 
multivariate normal (as when distributions are highly skewed), and (3) improper or non-convergent 
results are likely (as when estimating a complex model with many variables and parameters). Consider 
that satisfaction models often use small samples, especially at the segment level. Quality and customer 
satisfaction data is also marked by large negative skewness. And satisfaction models are often large and 
complex, involving multiple abstract benefits and dozens of attributes. 
 
These arguments illustrate why researchers prefer PLS when operationalizing an existing structural 
model, such as a company’s existing customer satisfaction model. PLS is, for example, used to estimate 
all of the major national satisfaction index models. PLS also has its disadvantages. One is that PLS tends 
to underestimate path coefficients and overestimate loadings. As Bagozzi and Yi argue, however, this 
means that the significant results of a PLS analysis can be given more credence because the test is more 
conservative. Other limitations of PLS are that jackknife or bootstrap procedures are needed to obtain 
estimates for the standard errors of the parameter estimates and, because PLS is a limited-information 
estimation method, its estimates are not as efficient as full-information estimates. Overall, however, 
there are clear reasons for integrating the advantages of PLS with the advantages of the finite mixture
approach in a satisfaction context. 
 
When estimating structural equation models, researchers frequently treat data as if it were collected 
from a single population. This is unlikely to be the norm in customer satisfaction research. In 
multidimensional expectancy value models, customers from different market segments can have very 
different belief structures. Thus the impact that different drivers have on satisfaction, and their level of 
performance, likely varies from segment to segment. 
 
Typically, heterogeneity in structural equation models has been addressed by assuming that consumers 
can be assigned to segments a priori on the basis of demographic variables, usage levels, or other proxies
for the underlying segments. A limitation of the a priori approach is that heterogeneity is often not 
captured adequately by well-known observable variables. Jedidi/Jagpal/DeSarbo (1997a, 1997b) 
propose a new approach based on CSA where heterogeneous groups are identified simultaneously with 
the structural equation model using a finite mixture framework. Arminger/Stein (1997) propose a more 
general method based on covariance structure estimates. 
 
An alternative is to develop a hierarchical Bayesian methodology for treating heterogeneity in structural 
equation models. An important advantage of this methodology is that it automatically provides 
individual-specific estimates of model parameters and factor scores. This is an interesting source for 
marketing managers who want to implement a relationship-marketing concept based on individual 
customer-to-supplier relationships. However, the researcher requires some meaningful, a priori 
information about the parameters and more than one observation from at least some individuals. 
 
Again, our goal is to combine the advantages of PLS with the finite mixture approach. Figure 1 provides a 
taxonomy of methods that seek to capture heterogeneity in structural equation models and shows 
where our proposed approach fits in the taxonomy. Substantively, the approach allows the marketing 
manager to perform response-based market segmentation where all consumers or customers in a 
segment are homogeneous in terms of the model’s path coefficients. As we will show, the approach 
complements a priori segmentation by capturing heterogeneity within existing, well-known segments. 
Methodologically, the approach contributes to marketing research by allowing researchers to detect
unobservable, discrete moderating factors that account for heterogeneity among consumers. It combines
the advantages of predicting path coefficients using PLS with the maximum likelihood estimation of a
finite mixture model. Conceptually, the approach extends a priori segmentation methods to
prediction-oriented structural equation models. In contrast to the existing approaches of Jedidi et al.
(1996, 1997a, 1997b) and Ansari et al. (2000), our approach is more management oriented because the
model can include both formative and reflective measures. In addition, if all exogenous variables of the
inner model are assumed to be formative and the endogenous variables are assumed to be reflective, a
simulation environment is given.
 
 
 
3 The finite mixture partial least squares approach 
Wold (1966) originally developed the PLS approach as an algorithm for least squares (LS) estimation of 
path models with latent variables. Each latent variable (LV) is indirectly observed by a block of manifest 
variables (MVs). PLS predicts the linear conditional expectation relationship between dependent and 
independent variables. As this approach is based on predictor specification, PLS is quite different from 
covariance structure analysis like LISREL that focuses on a causality concept based on accounting for 
covariances. 
A path model with latent variables (structural equation model) consists of an inner model (inner
relations, structural model, substantive part) and an outer model. The inner model depicts the
relationships among the latent variables as posited by substantive theory.
Let 
$i$ = the subject (observation, individual), with $i = 1, \dots, N$;
$\eta_i$ = the vector of the endogenous variables in the inner model for subject $i$;
$\xi_i$ = the vector of the exogenous variables in the inner model for subject $i$.
The inner relations can be expressed by:
$$B\eta_i + \Gamma\xi_i = \zeta_i \tag{1}$$
where $B$ $(Q \times Q)$ and $\Gamma$ $(Q \times P)$ are path coefficient matrices, with $Q$ = number of
endogenous variables and $P$ = number of exogenous variables, and $\zeta_i$ is a random vector of
residuals [23].
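Because Equation 1 places both the endogenous and the exogenous terms on the left-hand side, path coefficients enter $B$ and $\Gamma$ with a negative sign. The following minimal sketch illustrates this reading for a hypothetical recursive model with two exogenous and two endogenous variables. The coefficient values, the variable roles, and the helper name `solve_inner_model` are illustrative assumptions and are not taken from the study.

```python
import numpy as np

# Hypothetical recursive inner model (illustrative values, not estimates from the paper):
#   eta_1 = 0.4*xi_1 + 0.3*xi_2 + zeta_1   (e.g. satisfaction)
#   eta_2 = 0.6*eta_1 + zeta_2             (e.g. loyalty)
# Written as B eta + Gamma xi = zeta, the path coefficients carry a minus sign.
B = np.array([[1.0, 0.0],
              [-0.6, 1.0]])
Gamma = np.array([[-0.4, -0.3],
                  [0.0, 0.0]])


def solve_inner_model(xi, zeta):
    """Return eta for each subject (rows): eta_i = B^{-1} (zeta_i - Gamma xi_i)."""
    rhs = zeta - xi @ Gamma.T
    return np.linalg.solve(B, rhs.T).T


rng = np.random.default_rng(0)
xi = rng.standard_normal((5, 2))            # exogenous LV scores for 5 subjects
zeta = 0.1 * rng.standard_normal((5, 2))    # inner-model residuals
eta = solve_inner_model(xi, zeta)

# Plugging eta back into Equation 1 reproduces the residuals.
residual = eta @ B.T + xi @ Gamma.T
```

Under this convention, zero entries in $B$ and $\Gamma$ correspond to paths that are absent from the inner model.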
Outer relations define the relationships between the manifest variables (indicators) and the latent 
variables (components). Two kinds of outer relationships can be specified: reflective and formative. PLS 
allows for either type of relationship. 
Let 
$x_i$ = the vector of observed measures for the exogenous LV for subject $i$;
$y_i$ = the vector of observed measures for the endogenous LV for subject $i$.
The outer relations for the reflective (outward) model can be expressed by:
$$y = \Lambda_y \eta + \varepsilon_y \tag{2a}$$
$$x = \Lambda_x \xi + \varepsilon_x \tag{2b}$$
where $\Lambda_y$ $(K \times Q)$ and $\Lambda_x$ $(L \times P)$ are the matrices of loadings that relate the
latent variables to their measures, $K$ is the number of indicators for the endogenous variables, $L$ is
the number of indicators for the exogenous variables, and the $\varepsilon$'s are residuals, usually
interpreted as measurement errors or noise.
In the formative case, the relationships between the LVs and their indicators are defined as: 
$$\eta = \pi_\eta y + \delta_\eta \tag{3a}$$
$$\xi = \pi_\xi x + \delta_\xi \tag{3b}$$
where the $\pi$'s are the multiple regression coefficients and the $\delta$'s are the residuals from the
regressions.
The usual PLS algorithm predicts $B$, $\Gamma$, the $\Lambda$'s and the $\pi$'s with an iterative scheme of
partial least squares and calculates the scores of $\eta$ and $\xi$ for every individual. $\eta$ and $\xi$
are multivariate normally distributed [24]. The result is an aggregate predictor specification based on
the constraints of $B$ and $\Gamma$ for the whole sample.
Conceptually, heterogeneity in a satisfaction model is concentrated in the path coefficients that
relate quality factors and price to satisfaction and subsequent loyalty [25]. The proposed model is an
approach to capture this heterogeneity. It assumes that $\eta_i$ is distributed as a finite mixture of
conditional multivariate normal densities [26], $f_{i|k}(\cdot)$:
$$\eta_i \sim \sum_{k=1}^{K} \rho_k f_{i|k}(\eta_i \mid \xi_i, B_k, \Gamma_k, \psi_k) = \sum_{k=1}^{K} \rho_k \left[ \frac{|B_k|}{(2\pi)^{Q/2} |\psi_k|^{1/2}} \exp\left( -\frac{1}{2} (B_k\eta_i + \Gamma_k\xi_i)' \psi_k^{-1} (B_k\eta_i + \Gamma_k\xi_i) \right) \right] \tag{4}$$
where: 
$k = 1, \dots, K$ latent classes;
$m = 1, \dots, Q$ number of endogenous variables;
$j = 1, \dots, P$ number of exogenous variables;
$B_k = ((\beta_{rmk}))$, the $(Q \times Q)$ matrix of endogenous variable coefficients for latent class $k$ ($r = 1, \dots, Q$);
$\Gamma_k = ((\gamma_{mjk}))$, the $(Q \times P)$ matrix of exogenous variable coefficients for latent class $k$;
$\psi_k$ = the $(Q \times Q)$ matrix with the variances for each regression of the inner model on the diagonal
and zeros elsewhere;
$\rho = (\rho_1, \dots, \rho_K)$, a vector of the $K$ mixing proportions of the finite mixture (of which $K - 1$ are
independent) such that $\rho_k > 0$ and $\sum_{k=1}^{K} \rho_k = 1$.
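To make the mixture density in Equation 4 concrete, the following minimal NumPy sketch evaluates the class-conditional log density and the resulting mixture density for a set of latent variable scores. It assumes that $\psi_k$ is diagonal, as stated in the definitions above, and that $|B_k|$ denotes the absolute value of the determinant. The function names `class_log_density` and `mixture_density` are hypothetical.

```python
import numpy as np


def class_log_density(eta, xi, B, Gamma, psi_diag):
    """Log of the class-conditional density in Equation 4, one value per subject.

    eta: (N, Q) endogenous LV scores; xi: (N, P) exogenous LV scores.
    B: (Q, Q); Gamma: (Q, P); psi_diag: (Q,) diagonal of psi_k (assumed diagonal).
    """
    Q = eta.shape[1]
    resid = eta @ B.T + xi @ Gamma.T               # rows hold (B eta_i + Gamma xi_i)'
    quad = np.sum(resid ** 2 / psi_diag, axis=1)   # resid' psi^{-1} resid for diagonal psi
    _, log_abs_det_B = np.linalg.slogdet(B)        # ln |det B_k|
    log_norm = (log_abs_det_B
                - 0.5 * Q * np.log(2.0 * np.pi)
                - 0.5 * np.sum(np.log(psi_diag)))
    return log_norm - 0.5 * quad


def mixture_density(eta, xi, rho, Bs, Gammas, psis):
    """Finite mixture density: sum over classes of rho_k * f_{i|k}."""
    parts = [r * np.exp(class_log_density(eta, xi, B, G, p))
             for r, B, G, p in zip(rho, Bs, Gammas, psis)]
    return np.sum(parts, axis=0)
```

Working on the log scale is a numerical convenience only; exponentiating the result gives the bracketed term of Equation 4.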
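Assuming that the $\eta_i$ vectors are independent, the likelihood function for the $N$ vectors
$(\eta_1, \dots, \eta_N)$ is given by:
$$L = \prod_{i=1}^{N} \left[ \sum_{k=1}^{K} \rho_k \left[ \frac{|B_k|}{(2\pi)^{Q/2} |\psi_k|^{1/2}} \exp\left( -\frac{1}{2} (B_k\eta_i + \Gamma_k\xi_i)' \psi_k^{-1} (B_k\eta_i + \Gamma_k\xi_i) \right) \right] \right] \tag{5}$$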
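The mixing proportions $\rho$ can be construed as prior probabilities of any subject belonging to the $K$
latent classes. The posterior probability of membership for subject $i$ in class $k$ ($\hat{P}_{ik}$) can be
computed using Bayes' theorem, conditional on the estimates of the class-specific parameters
$\hat{\rho}_k$, $\hat{B}_k$, $\hat{\Gamma}_k$, $\hat{\psi}_k$, via:
$$\hat{P}_{ik} = \frac{\hat{\rho}_k f_{i|k}(\eta_i \mid \xi_i, \hat{B}_k, \hat{\Gamma}_k, \hat{\psi}_k)}{\sum_{k=1}^{K} \hat{\rho}_k f_{i|k}(\eta_i \mid \xi_i, \hat{B}_k, \hat{\Gamma}_k, \hat{\psi}_k)}. \tag{6}$$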
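The Bayes update in Equation 6 can be sketched in a few lines. The example below assumes that the class-conditional log densities have already been evaluated and stored in an array `log_f` with one row per subject and one column per class. The function name `posterior_probabilities` and the log-scale stabilisation are illustrative choices, not part of the original formulation.

```python
import numpy as np


def posterior_probabilities(log_f, rho):
    """Posterior class membership probabilities (Equation 6).

    log_f: (N, K) array holding ln f_{i|k} for subject i and class k.
    rho:   (K,) mixing proportions.
    Returns an (N, K) array whose rows sum to one.
    """
    log_w = log_f + np.log(rho)                          # ln(rho_k * f_{i|k})
    log_w = log_w - log_w.max(axis=1, keepdims=True)     # guard against underflow
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)
```

Subtracting the row maximum before exponentiating leaves the ratio in Equation 6 unchanged, because the same factor cancels in numerator and denominator.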
 
3.1 Identification of the Model 
Mixtures of multivariate normal densities are typically identified [27], but the model specified in
Equation 4 is not identified if all elements in $\Gamma = (\Gamma_1, \dots, \Gamma_K)$, $B = (B_1, \dots, B_K)$
and $\psi = (\psi_1, \dots, \psi_K)$ are free. Identification in this context requires placing restrictions
on model parameters. The most common restrictions set some elements of $\Gamma$, $B$, and $\psi$ to zero or
some other constant, whereas others entail the imposition of equality or inequality constraints on
parameters [28]. In our model, only the diagonal parameters of $\psi$ are free; the off-diagonal
parameters are constrained to zero. The free parameters of $\Gamma$ and $B$ are determined by the inner
model: only the parameters that specify the inner model are free, whereas all other parameters are
constrained to zero.
3.2 Estimation of the Model via the EM-Algorithm 
The likelihood of the model developed in the previous section can be maximized using the EM
algorithm [29]. The algorithm contains an expectation part (E-step) and a maximization part (M-step).
Other optimization routines, such as Newton-Raphson or Fletcher-Reeves, could also be used to
maximize the likelihood function, but convergence is not ensured with these two methods. The EM
algorithm is attractive because it can be programmed easily and convergence is ensured. The
estimation procedure can be described as a two-stage process. In the first stage a PLS solution is
estimated based on the aggregate sample with the aim of obtaining predictor scores for the latent
variables, $\eta$ and $\xi$, for each respondent individually. In the second stage the predicted scores
of the latent variables are used as dependent and independent variables for a set of regressions of
the inner model, defined by the constraints of $B$ and $\Gamma$, respectively. Every endogenous variable
serves as the regressand of an OLS regression, whereas the regressors come from a subset of the
endogenous and exogenous variables. All regression equations are computed independently according
to the PLS assumption. Consequently the matrix for latent class $k$, $\psi_k$, is a diagonal matrix with
the variances of the partial regressions on the diagonal. Our segmentation approach relaxes the
second stage by implementing a finite mixture model with this set of regression equations. The
modification of the M-step is described later.
In order to present an EM formulation, we introduce nonobserved data via the indicator function:
$$z_{ik} = \begin{cases} 1 & \text{if subject } i \text{ belongs to class } k, \\ 0 & \text{otherwise.} \end{cases}$$
We assume that the nonobserved data in the vector $z_i = (z_{i1}, \dots, z_{iK})$ are independently and
identically multinomially distributed with probabilities $\rho_k$. The joint likelihood of $\eta_i$ and $z_i$ is
$$L_i(\eta_i, z_i; \xi_i, B_k, \Gamma_k, \psi_k, \rho_k) = \prod_k \left[ \rho_k f(\eta_i \mid \xi_i, B_k, \Gamma_k, \psi_k) \right]^{z_{ik}}. \tag{7}$$
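The complete likelihood over all subjects is
$$L = \prod_i \prod_k \left[ \rho_k f(\eta_i \mid \xi_i, B_k, \Gamma_k, \psi_k) \right]^{z_{ik}} \tag{8}$$
and the log-likelihood is
$$\ln L = \sum_i \sum_k z_{ik} \ln f(\eta_i \mid \xi_i, B_k, \Gamma_k, \psi_k) + \sum_i \sum_k z_{ik} \ln \rho_k. \tag{9}$$
The matrix $Z = (z_1, \dots, z_N)$ is considered as missing data.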
The EM algorithm starts with an E-step, in which the expectation of $\ln L$ is evaluated over the
conditional distribution of the nonobserved data $Z$, given the predicted values of $\eta_i$ and $\xi_i$
from the observed data $y_i$ and $x_i$ and the provisional estimates ($B^*$, $\Gamma^*$, $\psi^*$, and
$\rho^*$) of the parameters $B$, $\Gamma$, $\psi$, and $\rho$, respectively. These provisional estimates
can be calculated from a random sample of membership probabilities $P_{ik}$ or can be set by the
analyst based on assumptions and/or prior knowledge about the classes and the coefficients.
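The expectation of the log-likelihood function is
$$\begin{aligned} E(\ln L; \xi_i, \rho = \rho^*, B = B^*, \Gamma = \Gamma^*, \psi = \psi^*) &= \sum_i \sum_k E(z_{ik}; \xi_i, \rho^*, B^*, \Gamma^*, \psi^* \mid \eta_i) \ln f(\eta_i \mid \xi_i, B_k^*, \Gamma_k^*, \psi_k^*) \\ &\quad + \sum_i \sum_k E(z_{ik}; \xi_i, \rho^*, B^*, \Gamma^*, \psi^* \mid \eta_i) \ln \rho_k^*. \end{aligned} \tag{10}$$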
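The conditional expectation of $z_{ik}$ can be calculated as
$$E(z_{ik}; \xi_i, \rho = \rho^*, B = B^*, \Gamma = \Gamma^*, \psi = \psi^*) = \frac{\rho_k^* f(\eta_i \mid \xi_i, B_k^*, \Gamma_k^*, \psi_k^*)}{\sum_k \rho_k^* f(\eta_i \mid \xi_i, B_k^*, \Gamma_k^*, \psi_k^*)}. \tag{11}$$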
Comparing (11) with (6) reveals that the posterior membership probability $P_{ik}^*$ for subject $i$ in
class $k$, evaluated with the provisional estimates, is
$$P_{ik}^* = E(z_{ik}; \xi_i, \rho^*, B^*, \Gamma^*, \psi^* \mid \eta_i). \tag{12}$$
The nonobserved data in matrix $Z$ are replaced by the posterior probabilities calculated on the basis
of the provisional estimates. Thus Equation (10) becomes
$$\begin{aligned} E(\ln L; \xi_i, \rho = \rho^*, B = B^*, \Gamma = \Gamma^*, \psi = \psi^*) &= \sum_i \sum_k P_{ik}^* \ln f(\eta_i \mid \xi_i, B_k^*, \Gamma_k^*, \psi_k^*) \\ &\quad + \sum_i \sum_k P_{ik}^* \ln \rho_k^*. \end{aligned} \tag{13}$$
In the M-step we maximize Equation (9) with respect to the parameters, subject to the restrictions
$\rho_k > 0$ and $\sum_k \rho_k = 1$, conditional on the new provisional estimates of $z_{ik}$, in order
to obtain revised parameter estimates. These revised estimates are then used in the subsequent
E-step to calculate new estimates of $P_{ik}$, which in turn serve as expectations of $z_{ik}$ in the
next M-step to obtain new estimates of the parameters, and so forth.
In our approach the M-step contains a number of independent OLS regressions, one for each
regression in the inner model. The regressions of the inner model reveal the relationships between
the $m$ endogenous variables (as dependent variables) and the exogenous and endogenous variables
(as independent variables) of the model. The relationships are defined via $B$ and $\Gamma$. Thus, for
each endogenous variable as a dependent variable, an OLS regression is calculated in the M-step. We
use the maximum likelihood estimator of the coefficients and the variance, which is identical to the
least squares prediction in the OLS case.
Let 
$m$ = number of independent regressions in the inner model;
$A_m$ = number of exogenous variables as regressors in regression $m$;
$B_m$ = number of endogenous variables as regressors in regression $m$;
$X_{mi}$ = the value of the regressor $((A_m + B_m) \times 1)$-vector for regression $m$ of individual $i$.
We obtain the parameters of the regression for endogenous variable $m$ with
$$Y_{mi} = \eta_{mi}$$
$$X_{mi} = (E_{mi}, N_{mi})'$$
where
$$E_{mi} = \begin{cases} (\xi_1, \dots, \xi_{A_m}) & \text{if } A_m \ge 1,\ a_m = 1, \dots, A_m \text{ and } \xi_{a_m} \text{ is a regressor of regression } m, \\ (\;) & \text{else,} \end{cases}$$
$$N_{mi} = \begin{cases} (\eta_1, \dots, \eta_{B_m}) & \text{if } B_m \ge 1,\ b_m = 1, \dots, B_m \text{ and } \eta_{b_m} \text{ is a regressor of regression } m, \\ (\;) & \text{else,} \end{cases}$$
and the closed-form OLS analytic expressions for $\tau_{mk}$ and $\omega_{mk}$ are
$$\tau_{mk} = \left[ \sum_i P_{ik} (X_{mi}' X_{mi}) \right]^{-1} \left[ \sum_i P_{ik} (X_{mi}' Y_{mi}) \right], \tag{14}$$
with
$$\tau_{mk} = ((\gamma_{a_m m k}), (\beta_{b_m m k}))'$$
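and
$$\omega_{mk} = \left[ \sum_i P_{ik} (Y_{mi} - X_{mi}\tau_{mk})(Y_{mi} - X_{mi}\tau_{mk})' \right] / (I\rho_k) \tag{15}$$
with $\omega_{mk}$ = cell $(m \times m)$ of $\psi_k$ and $I$ = the number of subjects.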
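The M-step in Equations 14 and 15 amounts to a posterior-weighted least squares fit for each inner regression and each class. The following minimal sketch treats the regressors of one regression as an $(N \times p)$ matrix `X` whose rows are the subjects, which is an implementation assumption. It also updates the mixing proportion as the mean posterior probability, so that $I\rho_k = \sum_i P_{ik}$; this is the standard finite mixture update and is assumed here rather than quoted from the text. The function name `m_step_regression` is hypothetical.

```python
import numpy as np


def m_step_regression(X, y, P_k):
    """Posterior-weighted OLS for one inner regression m and one class k.

    X:   (N, p) regressor scores (exogenous and endogenous LVs of regression m).
    y:   (N,)   scores of the dependent endogenous LV.
    P_k: (N,)   posterior membership probabilities for class k.
    Returns (tau, omega, rho_k).
    """
    XtWX = X.T @ (P_k[:, None] * X)
    XtWy = X.T @ (P_k * y)
    tau = np.linalg.solve(XtWX, XtWy)                 # Equation 14
    resid = y - X @ tau
    omega = np.sum(P_k * resid ** 2) / np.sum(P_k)    # Equation 15, with I * rho_k = sum_i P_ik
    rho_k = P_k.mean()                                # assumed standard mixture update
    return tau, omega, rho_k
```

With all weights equal to one the function reduces to ordinary least squares on the full sample, which mirrors the $K = 1$ case.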
The result of each independent regression serves as a new provisional estimate for the next E-step
iteration of the EM algorithm. The E-step and the M-step are applied successively until no further
improvement in the log-likelihood function is possible based on a pre-specified convergence
criterion. Although convergence to at least a locally optimal solution is guaranteed, different
starting values of the parameters must be used to investigate the potential occurrence of local
optima.
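The alternation of E-step and M-step, together with the use of several starting values, can be sketched as follows. The example is deliberately simplified: it handles a single inner regression (one endogenous variable, so $B = 1$ and the density reduces to a univariate normal), starts from random membership probabilities, and keeps the run with the highest log-likelihood. The function names, the convergence tolerance, the variance floor, and the number of restarts are all illustrative assumptions.

```python
import numpy as np


def em_mixture_regression(X, y, K, rng, max_iter=500, tol=1e-8):
    """EM for a K-class mixture of one linear regression (simplified sketch)."""
    N = len(y)
    P = rng.dirichlet(np.ones(K), size=N)      # random starting memberships
    prev_lnL = -np.inf
    for _ in range(max_iter):
        # M-step: posterior-weighted least squares per class
        rho = P.mean(axis=0)
        taus, omegas = [], []
        for k in range(K):
            w = P[:, k]
            tau = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
            resid = y - X @ tau
            omegas.append(max(np.sum(w * resid ** 2) / w.sum(), 1e-10))
            taus.append(tau)
        # E-step: posterior probabilities from the updated parameters
        log_w = np.column_stack([
            np.log(rho[k])
            - 0.5 * np.log(2.0 * np.pi * omegas[k])
            - 0.5 * (y - X @ taus[k]) ** 2 / omegas[k]
            for k in range(K)
        ])
        m = log_w.max(axis=1, keepdims=True)
        log_mix = m[:, 0] + np.log(np.exp(log_w - m).sum(axis=1))
        lnL = log_mix.sum()                    # observed-data log-likelihood
        P = np.exp(log_w - log_mix[:, None])
        if lnL - prev_lnL < tol:
            break
        prev_lnL = lnL
    return lnL, rho, np.array(taus), np.array(omegas), P


def best_of_restarts(X, y, K, n_starts=10, seed=0):
    """Run EM from several random starts and keep the highest log-likelihood."""
    rng = np.random.default_rng(seed)
    runs = [em_mixture_regression(X, y, K, rng) for _ in range(n_starts)]
    return max(runs, key=lambda run: run[0])
```

Comparing the solutions across restarts, rather than only keeping the best one, gives a rough indication of how many distinct local optima exist for a given number of classes.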
3.3 Model Selection
When applying the above model to data, the actual number of classes $K$ is unknown and must be
inferred from the data. The problem of identifying the number of classes is still without a
satisfactory statistical solution [30]. The likelihood ratio test statistic, for example, is not valid
in a mixture model because it is not asymptotically distributed as chi-square. Bozdogan/Sclove (1984)
propose using Akaike's (1974) information criterion (AIC) for determining the number of classes in a
mixture model:
$$AIC_K = -2 \ln L + c N_K, \tag{16}$$
where $c = 2$ is a constant and $N_K$ is the number of free parameters:
$$N_K = (K - 1) + KR + KQ. \tag{17}$$
$R$ is the number of predictor variables in all regressions of the inner model. The constant $c$ in AIC
imposes a penalty on the likelihood, which weighs the increase in fit (more parameters yield a
higher likelihood) against the additional number of parameters estimated [31].
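The parameter count in Equation 17 and the AIC in Equation 16 are simple to compute once the log-likelihood is available. The sketch below uses hypothetical function names; the values in the usage lines ($K = 5$, $R = 10$, $Q = 2$ and a log-likelihood of $-100$) are illustrative and are not results from the study.

```python
def n_free_parameters(K, R, Q):
    """Equation 17: (K - 1) mixing proportions, K*R path coefficients, K*Q variances."""
    return (K - 1) + K * R + K * Q


def aic(lnL, K, R, Q, c=2.0):
    """Equation 16: AIC_K = -2 ln L + c * N_K."""
    return -2.0 * lnL + c * n_free_parameters(K, R, Q)


# Illustrative values only
n_params = n_free_parameters(K=5, R=10, Q=2)   # 4 + 50 + 10 = 64
aic_value = aic(lnL=-100.0, K=5, R=10, Q=2)    # 200 + 2 * 64 = 328
```

Each additional class adds $R + Q + 1$ free parameters, so the penalty grows linearly in $K$.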
Two criteria penalize the likelihood more heavily: Schwarz's (1978) Bayesian information criterion
(BIC),
$$BIC_K = -2 \ln L + (\ln I) N_K, \tag{18}$$
where $c = \ln I$, with $I$ the number of subjects (the sample size), and the consistent Akaike
information criterion (CAIC), computed as
$$CAIC_K = -2 \ln L + (\ln I + 1) N_K, \tag{19}$$
where $c = \ln I + 1$. However, all of the measures discussed above are heuristics for model selection.
To assess the separation of the segments, an entropy statistic [32] can be used to investigate the
degree of separation in the estimated individual class probabilities:
$$EN_K = 1 - \frac{\sum_i \sum_k -P_{ik} \ln P_{ik}}{I \ln K}.$$
$EN_K$ is a relative measure and is bounded between 0 and 1. Values close to 1 indicate that the
derived classes are well separated. In addition, the entropy measure indicates whether a solution is
interpretable or not. For example, a solution with good heuristics and a bad entropy measure, say
$EN_K = 0$, cannot be interpreted accurately. The segments are "fuzzy," which means that only "parts"
of the subjects belong to a class. The fuzziness of any derived class memberships makes the
managerial implications equally "fuzzy."
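The two heavier penalties and the entropy statistic can be computed directly from the log-likelihood, the parameter count, and the matrix of posterior membership probabilities. The following sketch uses hypothetical function names, takes CAIC with the penalty $c = \ln I + 1$ stated in the text, and treats $0 \ln 0$ as zero, which is a common convention assumed here.

```python
import numpy as np


def bic(lnL, n_params, I):
    """Equation 18: BIC_K = -2 ln L + (ln I) * N_K."""
    return -2.0 * lnL + np.log(I) * n_params


def caic(lnL, n_params, I):
    """Equation 19 with c = ln I + 1: CAIC_K = -2 ln L + (ln I + 1) * N_K."""
    return -2.0 * lnL + (np.log(I) + 1.0) * n_params


def entropy_statistic(P):
    """EN_K = 1 - [sum_i sum_k -P_ik ln P_ik] / (I ln K), defined for K >= 2."""
    I, K = P.shape
    if K < 2:
        raise ValueError("EN_K requires at least two classes")
    plogp = np.zeros_like(P, dtype=float)
    mask = P > 0
    plogp[mask] = P[mask] * np.log(P[mask])    # 0 * ln 0 is treated as 0
    return 1.0 - (-plogp.sum()) / (I * np.log(K))
```

Crisp memberships (each subject assigned to one class with probability one) give $EN_K = 1$, while uniform memberships give $EN_K = 0$, which matches the bounds described above.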
4 Empirical application 
We illustrate the Finite Mixture PLS approach using a national survey of customers’ perceived quality, 
satisfaction and loyalty with convenience stores. The survey was sponsored by the National Association 
of Convenience Stores (NACS) and based on a representative cross-section of convenience store 
customers and stores in the United States. The interview methodology (computer aided telephone 
interviews) and random sampling procedure were the same as those used for the American Customer
Satisfaction Index (ACSI) survey. The data were collected in December of 1998, and the sample included 
1,025 customers who were selected to be representative of the demographic profile of convenience 
store consumers. In terms of demographics, 42.4% were male and 57.6% were female. The age range 
was from 18-81 with a median age of 37. The sample was also broadly distributed across income and 
education levels. 
The NACS satisfaction model is presented in Figure 2. The Figure shows nine latent variables, benefits or 
consequences that are immediate antecedents of satisfaction in the model. Satisfaction, in turn, affects 
customer loyalty, both of which are also latent variables in the model. Each latent variable is 
operationalized using multiple proxies or survey measures rated on 10-point scales (see Table 1). The
satisfaction measures are the same as those used in the ACSI survey, while the loyalty measures are 
rated likelihoods of revisiting the convenience store and recommending it to others. A correlation matrix 
of the measurement variables is shown in Table 2. 
The measures of satisfaction and loyalty are assumed to be reflective, whereas the remaining measures
are assumed to be formative. This follows the assumptions of the ACSI [34].
4.1 Aggregate results 
The aggregate PLS results are shown in Table 3. The values of the inner model calculated with PLS are 
equal to those of our new approach with K = 1. According to the aggregate path coefficients, the largest
drivers of satisfaction are perceived safety (0.193), store layout (0.174), prices (0.168) and separate take 
out (0.152). The smallest drivers of satisfaction are products (0.039) and motorist services (0.039). This
suggests that, on the whole, both products and gas services are relatively undifferentiated across 
convenience stores. Satisfaction has a large and significant impact on loyalty in the model (0.625). 
The aggregate results for K = 1 provide important benchmark values for the goodness of fit measures 
AIC, BIC and CAIC, which are shown in the first row of Table 4. The next section discusses the 
constraining of parameters and results for K greater than one. 
4.2 Disaggregate finite mixture results 
We applied our new approach for a varying number of classes K. The impacts should be greater or equal 
to zero (for both the aggregate and disaggregate models) as all of the survey questions are valenced in 
the same direction (higher values are more attractive, such as higher quality or more attractive prices). If 
we assume that all drivers of satisfaction are independent of each other, the impacts can be interpreted 
as the increase in satisfaction that results from an increase in any particular price or quality driver. Our 
initial disaggregate solutions contained some small, negative coefficients for the drivers. With the 
assumption of independence, an interpretation would be difficult. For example, a negative impact of
service on satisfaction would mean that higher quality service lowers overall satisfaction. To prevent
such non-interpretable solutions, we constrained our coefficients to be greater than or equal to zero.
Consequently we only obtain local but interpretable optima for our coefficients. In addition, we used the 
constrained solution which has the minimized values in AIC, BIC and CAIC. 
 
Table 4 shows the goodness of fit statistics for model selection (ln-likelihood (LnL), AIC, BIC, CAIC) and 
the entropy measure (EN) described above for K = 1, 2, 3, 4, 5 and 6. In our application the model
selection heuristics do not contradict each other. Had they done so, the more conservative BIC would
be preferred.
In the last section we mentioned the problems associated with locally optimal solutions. We therefore
estimated each multi-class solution (K = 2 through 6) ten times with different random starting values to
improve the chance of finding the global maximum. For K = 2 the algorithm always found the same solution.
For K = 3 we obtained different solutions that were close together, and the solution shown in Table 4 was
the one with the best goodness of fit measures (the smallest values for AIC, BIC and CAIC). The solutions
for K = 4, K = 5 and K = 6 showed greater variance. This shows that finding a global maximum becomes
more difficult as the number of classes K increases. One reason is the high-dimensional solution space in
which the local and global maxima lie. The success of iterative procedures like the EM algorithm depends
on a good (plausible) set of starting values. As we do not know any good, or at least plausible, sets of
starting values, we have to increase the number of alternative solutions. Therefore, for K = 4, K = 5 and
K = 6 we ran the approach twenty instead of ten times. Table 4 shows the solution with the smallest AIC,
BIC and CAIC for K = 4, K = 5 and K = 6.
The R2 value of the aggregate model is 0.63. The R2 value of the K = 5 solution is 0.88. This
indicates that the explained variance improves substantially when going from one to five segments.
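
The multiple-start strategy can be sketched in a few lines. The example below is a deliberate simplification: a one-dimensional Gaussian mixture stands in for the Finite Mixture PLS model, and the data are simulated. All function names are hypothetical. It runs EM from several random starting values and keeps the run with the highest ln-likelihood.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def mixture_log_likelihood(data, weights, means, sds):
    return sum(
        math.log(sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)))
        for x in data
    )


def em_run(data, k, rng, n_iter=200):
    """One EM run from random starting values; returns (lnL, means, sds, weights)."""
    n = len(data)
    grand_mean = sum(data) / n
    grand_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in data) / n)
    means = rng.sample(data, k)  # random start: k observations serve as initial means
    sds = [grand_sd] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior membership probability of each observation in each class.
        posteriors = []
        for x in data:
            joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(joint)
            posteriors.append([j / total for j in joint])
        # M-step: weighted updates of class sizes, means and standard deviations.
        for c in range(k):
            mass = max(sum(p[c] for p in posteriors), 1e-12)
            weights[c] = mass / n
            means[c] = sum(p[c] * x for p, x in zip(posteriors, data)) / mass
            var = sum(p[c] * (x - means[c]) ** 2 for p, x in zip(posteriors, data)) / mass
            sds[c] = max(math.sqrt(var), 1e-3)  # floor guards against collapsing classes
    return mixture_log_likelihood(data, weights, means, sds), means, sds, weights


def best_of_random_starts(data, k, n_starts=10, seed=0):
    """Run EM n_starts times and return (best run, all runs)."""
    rng = random.Random(seed)
    runs = [em_run(data, k, rng) for _ in range(n_starts)]
    return max(runs, key=lambda run: run[0]), runs


# Simulated data: two well-separated groups of 60 observations each.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(60)] + [data_rng.gauss(6.0, 1.0) for _ in range(60)]
best, runs = best_of_random_starts(data, k=2, n_starts=10)
print(best[0], sorted(best[1]))
```

Comparing the ln-likelihoods stored in `runs` gives a quick impression of how much the solutions vary across starts, which is the diagnostic described above for K = 3 through 6.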
 
 
 
 
 
 
 
It is important to emphasize that the approach can be used in either an exploratory or a confirmatory
fashion. If the researcher has a priori information about the true values of the model, he or she
can integrate this information when setting the starting values. If the algorithm finds a solution that
corresponds to or is very similar to the starting values, this suggests that the prior information is a good
start. In contrast, if the researcher wants to find new information about his or her model, a number of
different starting values are used to reduce the risk that the result is a local optimum in the
high-dimensional solution space. We focus on the 5-class solution, where the AIC, BIC and CAIC measures
are minimized. This solution also has the best entropy measure among the K = 2 through 6 solutions
(EN = 0.43). Note that the selection of the most interpretable solution is the same as for the
unconstrained case mentioned earlier: the 5-class solution.
Table 5 presents the path coefficients (impacts) for each of the 5 classes, where each class represents a 
relatively homogeneous group of customers. Going forward, we refer to these classes as market 
segments. Segments one through five comprise 10.7%, 36.8%, 17.2%, 27.7% and 7.6% of the overall 
survey population respectively. For segment one, satisfaction is almost synonymous with safety, which 
has an impact of 0.984. The next highest and only other significant driver is cleanliness with an impact of 
0.291. Segment two, the largest segment, is more balanced in that service, prices, cleanliness, 
convenience and safety all have significant impacts (ranging from 0.185 to 0.260). This segment also 
shows the largest impact of satisfaction on loyalty (0.863). Segment three is quite different from either 
of the previous two segments, as store layout and separate take out are the main drivers with impacts 
of 0.495 and 0.485. Segment four, which is the second largest overall, is the most price sensitive
segment, with a price impact of 0.210. Store layout and separate take out also have significant impacts
for this segment. Segment five is marked by the importance placed on store layout and convenience, 
with impacts of 0.595 and 0.312 respectively. These shoppers want to find what they need and get in 
and out of the store quickly. 
A membership probability is calculated for each customer in each segment. The entropy measure
EN = 0.43 for K = 5 gives an aggregate value of how strongly customers belong to one particular segment.
However, the entropy measure does not reveal how the membership probabilities are distributed for
individual customers. For example, on the one hand, a customer can belong to four different segments with
a membership probability of, say, .10 each and to one segment with .60. On the other hand, a customer can
belong to each segment with a membership probability of .20. In addition, the differences in the entropy
statistics between the solutions are small. Therefore a more detailed investigation of the membership
probabilities of the five-segment solution should be useful.
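
To make the two example customers concrete, the sketch below evaluates a normalized entropy measure on each of them. It assumes the form commonly attributed to Ramaswamy et al. (1993), EN = 1 - [sum over i and k of -P_ik ln P_ik] / (I ln K); the paper's own definition appears earlier and may differ in detail. The function name is hypothetical.

```python
import math


def entropy_measure(posteriors):
    """Normalized entropy EN in [0, 1]; values near 1 indicate clear-cut membership.

    posteriors is a list of rows, one per customer, each holding K membership
    probabilities that sum to one.
    """
    n_customers = len(posteriors)
    n_segments = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the term 0 * ln(0) is treated as 0
                total -= p * math.log(p)
    return 1.0 - total / (n_customers * math.log(n_segments))


peaked_customer = [[0.10, 0.10, 0.10, 0.10, 0.60]]
uniform_customer = [[0.20, 0.20, 0.20, 0.20, 0.20]]
print(entropy_measure(peaked_customer))   # roughly 0.24
print(entropy_measure(uniform_customer))  # essentially 0
```

Under this assumed formula, even a customer with a .60 membership in one segment contributes only about 0.24, which helps explain why an aggregate value such as 0.43 is hard to interpret without looking at the individual probabilities.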
Table 6 shows the number of customers who belong to a segment with a membership probability higher
than .80, .60, .50 and .40 respectively. In our application only 130 customers belong to one segment
with a membership probability higher than .80. This is about 13% of the whole sample. Ideally, each
customer should belong almost exclusively to one specific segment, so that the highest membership
probability is near 1. In reality, the lower membership probabilities illustrate the complexity
of measuring response-based variables. Table 6 shows that 686 customers out of 1,025 belong to one
segment with a probability higher than .50. This means that our 5-segment solution is a fairly good
approximation for grouping 1,025 different individuals together into 5 segments.
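
A tabulation of this kind is straightforward to compute from the posterior membership probabilities. The sketch below is a hypothetical helper applied to four invented customers; it does not reproduce the counts in Table 6.

```python
def count_by_max_membership(posteriors, thresholds=(0.80, 0.60, 0.50, 0.40)):
    """For each threshold, count customers whose largest membership probability exceeds it."""
    max_probs = [max(row) for row in posteriors]
    return {t: sum(1 for p in max_probs if p > t) for t in thresholds}


# Invented posterior probabilities for four customers and three segments.
toy_posteriors = [
    [0.90, 0.05, 0.05],
    [0.55, 0.30, 0.15],
    [0.45, 0.45, 0.10],
    [0.34, 0.33, 0.33],
]
counts = count_by_max_membership(toy_posteriors)
print(counts)
```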
 
 
 
 
4.3 Post hoc analyses of the segments 
To augment our interpretation of the segment-level results, we conducted post hoc analyses of the 
posterior probabilities of membership based on a model from Ramaswamy et al. (1993): 
 
$Q_{ik} = \sum_{u} Z_{iu} \delta_{uk} + v_{ik}$,    (21)
with
$Q_{ik} = \ln(P_{ik} / P_i)$,
$P_i = (\prod_{k} P_{ik})^{1/K}$ as the geometric mean of the posterior probabilities,
$Z_{iu}$ as the value of descriptive variable u for individual i,
$\delta_{uk}$ as the impact coefficient for variable u for segment k,
$v_{ik}$ as a random normal disturbance variable.
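
The dependent variable of equation (21) can be computed directly from a customer's posterior probabilities. The sketch below is illustrative only; the function name is hypothetical, and it assumes that every posterior probability is strictly positive so that the logarithm is defined.

```python
import math


def log_ratio_transform(posterior_row):
    """Return Q_ik = ln(P_ik / P_i) for one customer, with P_i the geometric mean of the row."""
    logs = [math.log(p) for p in posterior_row]
    log_geometric_mean = sum(logs) / len(logs)  # ln of (prod_k P_ik)^(1/K)
    return [value - log_geometric_mean for value in logs]


q_row = log_ratio_transform([0.10, 0.10, 0.10, 0.10, 0.60])
print(q_row)
```

By construction the transformed values of each customer sum to zero across segments, so the coefficients for a given descriptive variable describe differences between segments rather than absolute levels.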
 
The descriptive variables in our study, collected as part of the convenience store survey, are: gender 
(male/female), age (in years), number of household members (5 categories), user frequency (daily, 
weekly, occasional user), education (3 categories), income (3 categories), 7-11 store user (Yes/No), 
neighborhood store user (Yes/No). The 7-11 brand was by far the most frequently measured 
convenience store in the sample, hence its use as a descriptive variable. Also common were
neighborhood store users, who, when asked “At which convenience store do you shop most often?”,
responded with a store name that is not part of a franchise system. This variable picks up the unique
nature of “Mom and Pop” stores (typically family owned) that make up a large proportion of the 
industry. Table 7 shows the impact coefficients from the post hoc analysis of our 5-segment solution. 
Overall there are relatively few significant descriptors for the five segments. One exception is gender, 
which is significantly related to segments two through five. For segment one, which is the safety 
conscious segment, household size is the largest descriptor. The larger the family, the more concern 
there is over safety. This is logical as larger families have more children who run errands or meet friends 
at convenience stores. Segment two, which had the largest number of significant drivers of satisfaction
(dominated by cleanliness), consists primarily of females who visited 7-11 stores. Segment three shoppers,
for whom store layout and separate take out food are the dominant drivers, are primarily females who were
not weekly shoppers. Segment four, the price conscious segment, is marked only by the fact that it is more
female. In contrast, the store layout and convenience segment (the “get me in and out quickly” 
segment) is predominantly male. 
Clearly, our analysis demonstrates that the results of an aggregate satisfaction model can be very
misleading. Aggregate analysis hides the existence of meaningful subsets of customers that are more
homogeneous in their satisfaction drivers. While some customers are dominantly concerned with safety,
other customers’ satisfaction is the result of convenience or price. It is also clear from our post hoc
analysis that the segments cannot be clearly identified using simple descriptive variables. This is natural,
as segments do not exist at the level of descriptive variables but rather at the level of benefits,
consequences and needs (Footnote 35). Yet marketing managers often require such variables to derive
implications for market action. Gender is clearly one variable that helps to differentiate at least one
segment. User frequency could also be such a variable. Managers in the convenience store industry pay
particular attention to daily, weekly and occasional users and how their needs differ. In the next section
of the paper, we use these a priori groupings before the customer satisfaction model is calculated.
4.4 Disaggregate PLS results: A priori segmentation 
Table 8 shows the PLS results for the a priori segmentation based on the daily, weekly and occasional 
user segments (n = 265, 300 and 436 respectively). The solutions show that, in each segment, there are 
five to seven significant drivers of satisfaction, none of which have particularly large impacts. Some 
notable differences are the increased importance of separate take out for daily users (who likely obtain 
more of their meals from the stores), the importance of service to weekly customers, and the 
importance of products to occasional users. 
But the pattern of results for each segment is similar to what we found for the aggregate sample. The
finite mixture-based segments show much more pronounced differences in satisfaction drivers across
segments. This suggests that, while the user frequency groups are homogeneous with respect to usage, 
they are still quite heterogeneous in their satisfaction drivers. The solution for each group is still an 
aggregate of different coefficients for the drivers of satisfaction. 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
4.5 Disaggregate PLS results: A priori and Finite Mixture PLS segmentation 
To illustrate this heterogeneity and show how the Finite Mixture PLS approach provides insight to an 
existing, a priori segmentation scheme, we applied the new approach to the daily users segment. This 
“heavy user” group is of obvious importance to convenience stores and a major focus of their marketing 
activity. But not all daily users are necessarily looking for the same things from their convenience store. 
When we applied our new approach to the daily users, a two-segment solution emerged based on the
ln-likelihood and on minimal values for the AIC, BIC and CAIC statistics (entropy = 0.60). The results are
shown in Table 9.
Satisfaction for Segment 1 (k = 1, 22.8% of the sample) is driven dominantly by store layout and
separate take out food, followed by motorist services. These “daily shoppers” fill their grocery baskets,
stomachs and vehicles at their local convenience stores. They also appreciate high quality service. In
contrast, Segment 2 customers (k = 2, 77.2% of the sample) are more sensitive to safety, prices and
cleanliness. These “daily stoppers” seem to stop to get just what they need. Satisfaction also has much
more impact on loyalty for the Segment 1 “shoppers” (0.823) than for the Segment 2 “stoppers” (0.460).
When we applied the post hoc analysis approach described above to the two-segment solution, one main
difference emerged: Segment 1 customers are significantly more likely to shop at 7-11 stores.
5 Discussion and conclusions 
An emergent solution to capturing heterogeneity in market response is to use a latent class approach 
such as a finite mixture model. But whereas the latent class methods are based on maximum likelihood 
estimation, the operationalization of a satisfaction model often necessitates a least squares-based 
procedure. As an SEM methodology, PLS (partial least squares) is particularly well suited to estimating 
and operationalizing satisfaction models in practice. PLS accommodates the skewed data and small 
sample sizes common in satisfaction research and, compared to other techniques, it is less prone to 
non-convergent or improper solutions. For managers, the performance scores and impacts that emerge 
from PLS analysis provide the diagnostic and benchmark information required to set priorities for 
improvement. 
The goal of this article has been to merge the advantages of least squares estimation, when estimating a
satisfaction model, with the advantages of maximum likelihood estimation, when deriving market
segments. Our Finite Mixture PLS approach is designed to capture heterogeneity in structural equation
models that link quality and price drivers to satisfaction and subsequent loyalty. It empirically derives
segments and directly estimates model relationships. The advantage of the approach compared to an a
priori segmentation scheme is that the derived segments are homogeneous in terms of model
relationships. The approach calculates segment proportions, or the degree to which customers belong
to particular segments, and the results can be statistically tested with goodness of fit measures. Thus
the proposed Finite Mixture PLS model expands the existing Partial Least Squares approach to include
one of the central issues in marketing theory and practice: segmentation.
When we apply the Finite Mixture PLS analysis to a national survey of quality, satisfaction and loyalty for
convenience store customers, it reveals significant heterogeneity. Our five-segment solution identifies
clear differences among customers who, for example, value safety, separate take out and store
layout, or prices. Another interesting observation is that, when we conduct a post hoc analysis that
relates descriptive variables to the segments, relatively few significant predictors emerge. Exceptions
include gender, household size and frequency of usage. This finding is consistent with the prevailing
view in marketing that segments exist at the level of benefits, consequences and needs, while
descriptive variables such as age, gender and frequency of use may be weak proxies (Footnote 36). To
illustrate the problem, we analyzed a prominent a priori segment, daily users, using the Finite Mixture
PLS approach. The results again reveal clear differences in both the drivers of satisfaction and the effect
of satisfaction on loyalty. Whereas “daily shoppers” value store layout, separate take out food and
motorist services, “daily stoppers” value safety, prices and cleanliness.
Our findings reinforce an underlying premise in marketing that is often lost in practice, particularly in the 
practice of measuring and managing customer satisfaction. Satisfaction studies often rely on concrete, 
descriptive attributes of the product, service and customer segment. According to our findings and in 
line with means-end theory, customers do not purchase a package of attributes, but rather a complex of
benefits or even a set of values. And the benefit segments themselves are not easily described using 
traditional demographic variables. Applied satisfaction models should strive to capture both the abstract 
nature of satisfaction drivers and satisfaction-based market segments. Finite mixture-based segments 
that are built upon a latent variable modeling approach, such as PLS, can go a long way toward 
explaining variance in satisfaction judgements. They also help companies to draw more reasonable 
conclusions than those based on descriptive variables alone, such as frequency of usage. 
One limitation of the proposed approach (mentioned in Footnote 4) is that it does not consider 
interaction effects in the inner model. In addition, in following the standard assumptions of the PLS 
approach, we assume that the regressions of the inner model are independent. Future research should 
focus on these aspects and on large-scale simulation studies to test the Finite Mixture PLS method in 
different marketing applications where heterogeneity is present. 
Another avenue for further research is a more thorough identification of market segments. The post hoc
analysis sheds little light on demographically identifiable segments. The segmentation results are
therefore of limited practical use, since management cannot readily identify which customers belong to
each of the market segments. One reviewer suggested a concomitant variable approach that
reparameterizes the mixing proportions as direct functions of the demographics. A test could reveal
which model fits best.
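
As a hedged sketch of what such a reparameterization could look like, the example below assumes a multinomial logit link between a customer's demographic vector z and the segment proportions, a common choice in concomitant variable latent class models. The function name, the coefficient values and the convention of fixing the first segment's coefficients at zero are illustrative assumptions, not part of the proposed method.

```python
import math


def mixing_proportions(z, gammas):
    """Segment proportions pi_k(z) = exp(gamma_k . z) / sum_l exp(gamma_l . z)."""
    scores = [sum(g * v for g, v in zip(gamma, z)) for gamma in gammas]
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: z = (intercept, gender dummy), two segments,
# first segment's coefficients fixed at zero for identification.
proportions = mixing_proportions([1.0, 0.0], [[0.0, 0.0], [math.log(2.0), 0.5]])
print(proportions)
```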
 
References 
 
Akaike, Hirotugu (1974), A new look at statistical model identification, in: IEEE Transactions on
 Automatic Control, Vol. 19, pp. 716-723.
Ansari, Asim/Jedidi, Kamel/Jagpal, Harsharan S. (2000), A hierarchical Bayesian methodology for
 treating heterogeneity in structural equation models, in: Marketing Science, Vol. 19, pp. 328-347.
Arminger, Gerhard/Stein, Petra (1997), Finite mixtures of covariance structure models with regressors,
 in: Sociological Methods & Research, Vol. 26, pp. 148-182.
Bagozzi, Richard P. (1982), A field investigation of causal relations among cognitions, affects, intentions,
 and behavior, in: Journal of Marketing Research, Vol. 19, pp. 562-584.
Bagozzi, Richard P. (1994), Structural equation models in marketing research: Basic principles, in:
 Bagozzi, Richard P. (ed.), Principles of Marketing Research, pp. 317-385.
Bagozzi, Richard P./Yi, Y. (1994), Advanced topics in structural equation models, in: Bagozzi, Richard
 P. (ed.), Advanced Methods of Marketing Research, pp. 1-52.
Best, Roger J. (2000), Market-based management: Strategies for growing customer value and
 profitability.
Bozdogan, Hamparsum/Sclove, Stanley L. (1984), Multi-sample cluster analysis using Akaike’s
 information criterion, in: Annals of the Institute of Statistical Mathematics, Vol. 36, pp. 163-180.
Brusco, Michael J./Cradit, J. Dennis/Stahl, Stephanie (2002), A Simulated Annealing Heuristic for a
 Bicriterion Partitioning Problem in Market Segmentation, in: Journal of Marketing Research, Vol. 39,
 pp. 99-109.
Dempster, Arthur P./Laird, Nan M./Rubin, Donald B. (1977), Maximum likelihood from incomplete data
 via the EM-algorithm, in: Journal of the Royal Statistical Society: Series B, Vol. 39, pp. 1-38.
Diamantopoulos, Adamantios/Winklhofer, Heidi M. (2001), Index Construction with Formative
 Indicators: An Alternative to Scale Development, in: Journal of Marketing Research, Vol. 38, pp.
 269-277.
Dillon, William R./White, John B./Rao, Vithala R./Filak, Doug (1997), Good science: Use structural
 equation models to decipher complex customer relationships, in: Marketing Research, Vol. 9,
 pp. 22-31.
Fornell, Claes (1987), A second generation of multivariate analysis: Classification of methods and
 implications for marketing research, in: Houston, Michael J. (ed.), Review of Marketing 1987.
Fornell, Claes (1995), The Quality of Economic Output: Empirical Generalizations About Its Distribution
 and Association to Market Share, in: Marketing Science, Vol. 14, pp. G203-G211.
Fornell, Claes/Bookstein, Fred L. (1982), Two structural equation models: LISREL and PLS applied to
 consumer exit-voice theory, in: Journal of Marketing Research, Vol. 19, pp. 440-452.
Fornell, Claes/Cha, Joe (1994), Partial least squares, in: Bagozzi, Richard P. (ed.), Advanced Methods of
 Marketing Research, pp. 52-78.
Fornell, Claes/Johnson, Michael D./Anderson, Eugene W./Cha, Joe/Bryant, Barbara E. (1996), The
 American customer satisfaction index: Nature, purpose and findings, in: Journal of Marketing,
 Vol. 60, pp. 7-18.
Gustafsson, Anders/Johnson, Michael D. (1997), Bridging the quality-satisfaction gap, in: Quality
 Management Journal, Vol. 4, pp. 27-43.
Hahn, Carsten H. (2002), Segmentspezifische Kundenzufriedenheitsanalyse.
Jedidi, Kamel/Jagpal, Harsharan S./DeSarbo, Wayne S. (1997a), Finite-mixture structural equation models
 for response-based segmentation and unobserved heterogeneity, in: Marketing Science, Vol. 16,
 pp. 39-59.
Jedidi, Kamel/Jagpal, Harsharan S./DeSarbo, Wayne S. (1997b), STEMM: A general finite mixture
 structural equation model, in: Journal of Classification, Vol. 14, pp. 23-50.
Jedidi, Kamel/Ramaswamy, Venkatram/DeSarbo, Wayne S./Wedel, Michel (1996), On estimating finite
 mixtures of multivariate regression and simultaneous equation models, in: Structural Equation
 Modeling, Vol. 3, pp. 266-289.
Johnson, Michael D./Gustafsson, Anders (2000), Improving customer satisfaction, loyalty and profit:
 An integrated measurement and management system.
Johnson, Michael D./Gustafsson, Anders/Andreassen, Tor W./Lervik, Line/Cha, Joe (2001), The evolution
 and future of national customer satisfaction index models, in: Journal of Economic Psychology,
 Vol. 22, pp. 217-245.
Jöreskog, Karl G. (1977), Structural equation models in the social sciences: Specification, estimation, and
 testing, in: Krishnaiah, Paruchuri R. (ed.), Applications of Statistics, pp. 265-287.
Kamakura, Wagner A./Russell, Gary J. (1989), A probabilistic choice model for market segmentation and
 elasticity structure, in: Journal of Marketing Research, Vol. 26, pp. 379-390.
Kamakura, Wagner A./Wedel, Michel/Agrawal, John (1994), Concomitant variable latent class models for
 conjoint analysis, in: International Journal of Research in Marketing, Vol. 11, pp. 451-464.
Martilla, John A./James, John C. (1977), Importance-performance analysis, in: Journal of Marketing, Vol.
 41, pp. 77-79.
McLachlan, Geoffrey J./Basford, Kaye E. (1988), Mixture models: Inference and applications to clustering.
McLachlan, Geoffrey J./Krishnan, Thriyambakam (1997), The EM-algorithm and extensions.
Muthén, Bengt O. (1989), Latent variable modeling in heterogeneous populations, in: Psychometrika,
 Vol. 54, pp. 557-585.
Ramaswamy, Venkatram/DeSarbo, Wayne S./Reibstein, David J./Robinson, William T. (1993), An
 empirical pooling approach for estimating marketing mix elasticities with PIMS data, in: Marketing
 Science, Vol. 12, pp. 103-124.
Schwarz, Gideon (1978), Estimating the dimension of a model, in: Annals of Statistics, Vol. 6, pp.
 461-464.
Steenkamp, Jan-Benedict E. M./Baumgartner, Hans (2000), On the use of structural equation models for
 marketing modeling, in: International Journal of Research in Marketing, Vol. 17, pp. 195-202.
Steenkamp, Jan-Benedict E. M./van Trijp, Hans C. M. (1996), Quality guidance: A consumer-based
 approach to food quality improvement using partial least squares, in: European Review of
 Agricultural Economics, Vol. 23, pp. 195-215.
Wedel, Michel/Kamakura, Wagner A. (1999), Market Segmentation: Conceptual and Methodological
 Foundations, Second Edition.
White, Michael E. (1997), Customer Satisfaction for the Ann Arbor Soccer Referee Association,
 Working Paper, University of Michigan Business School.
Wold, Herman (1966), Estimation of principal components and related models by iterative least squares,
 in: Krishnaiah, Paruchuri R. (ed.), Multivariate analysis: Proceedings of an international
 symposium held in Dayton, Ohio, pp. 391-420.