Latent Class models


Latent class (LC) models are increasingly used in choice analysis and are particularly well suited to investigating decision rule heterogeneity.

 

In the LC model, the probability that decision maker n chooses alternative i equals the sum, over all classes s, of the probability that he/she belongs to class s multiplied by the probability that i is chosen given membership of class s.
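
Written out, the sentence above corresponds to the expression below. The symbols β_s (class-specific taste parameters), S (number of classes) and P_n(i | β_s) (class-conditional choice probability) are notation assumed here for illustration; π_ns is the class membership probability.

```latex
P_n(i) = \sum_{s=1}^{S} \pi_{ns} \, P_n(i \mid \beta_s)
```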


The class membership model π_ns is typically a logit model. Class membership is a function f(·) of explanatory variables z_n (e.g. socio-demographic characteristics), where η_s denotes a vector of class-membership parameters to be estimated and δ_s denotes class-specific constants.
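
A minimal sketch of such a logit class membership model is given below. It assumes that f is linear in the explanatory variables, i.e. f(η_s, z_n) = η_s · z_n, which is a common choice but not stated in the text; the function name and all numbers are hypothetical.

```python
import numpy as np

def class_membership(delta, eta, z):
    """Logit class-membership probabilities pi_ns.

    delta : (S,)   class-specific constants delta_s
    eta   : (S, K) class-membership parameters eta_s
    z     : (N, K) explanatory variables z_n
    Returns an (N, S) array whose rows sum to one.
    Assumption: f is linear, f(eta_s, z_n) = eta_s . z_n.
    """
    delta = np.asarray(delta, dtype=float)
    eta = np.asarray(eta, dtype=float)
    z = np.asarray(z, dtype=float)
    score = delta + z @ eta.T                    # (N, S)
    score -= score.max(axis=1, keepdims=True)    # numerical stability
    expo = np.exp(score)
    return expo / expo.sum(axis=1, keepdims=True)

# Hypothetical example: 2 classes, 1 covariate.
# The first class is the reference class (constant and parameters fixed at 0).
delta = [0.0, 0.5]
eta = [[0.0], [-0.2]]
z = [[0.0], [1.0], [3.0]]
pi = class_membership(delta, eta, z)
```

For identification, the constant and parameters of one class are fixed at zero, as in the example. When no explanatory variables are available, the model reduces to class-specific constants only.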


In the context of advanced RRM models, an interesting research avenue is to define classes corresponding to different decision models.

 

Three LC applications are given below:

1) a two-class model comprising a RUM class and a P-RRM class (PYTHON, PANDAS, Apollo R and MATLAB).

2) a two-class model comprising two μRRM classes (PYTHON, PANDAS, LatentGOLD, Apollo R and MATLAB).

3) a three-class model comprising a RUM class, a P-RRM class and a μRRM class (PYTHON, PANDAS, Apollo R and MATLAB).
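
To illustrate how classes with different decision rules can be combined, the sketch below computes choice probabilities for a single choice situation under a linear-additive RUM-MNL rule, a P-RRM rule and a μRRM rule, and then mixes two of them with fixed class shares. It is not the code offered for download on this page; the regret functions follow the usual formulations of these models, and the attribute levels, parameters and shares are hypothetical.

```python
import numpy as np

def softmax(v):
    v = v - v.max()                      # numerical stability
    e = np.exp(v)
    return e / e.sum()

def attribute_differences(x, beta):
    # diff[i, j, m] = beta_m * (x_jm - x_im)
    return beta * (x[None, :, :] - x[:, None, :])

def rum_probs(x, beta):
    """Linear-additive RUM-MNL: V_i = sum_m beta_m * x_im."""
    return softmax(x @ beta)

def prrm_probs(x, beta):
    """P-RRM: R_i = sum_{j != i} sum_m max(0, beta_m * (x_jm - x_im))."""
    diff = attribute_differences(x, beta)
    regret = np.maximum(0.0, diff).sum(axis=(1, 2))   # j == i terms are zero
    return softmax(-regret)

def murrm_probs(x, beta, mu):
    """muRRM: R_i = sum_{j != i} sum_m ln(1 + exp(beta_m / mu * (x_jm - x_im))),
    with P_i proportional to exp(-mu * R_i)."""
    diff = attribute_differences(x, beta)
    terms = np.logaddexp(0.0, diff / mu)              # ln(1 + exp(.))
    idx = np.arange(x.shape[0])
    terms[idx, idx, :] = 0.0                          # exclude j == i
    regret = terms.sum(axis=(1, 2))
    return softmax(-mu * regret)

def latent_class_probs(class_probs, shares):
    """P(i) = sum_s pi_s * P(i | class s)."""
    return np.asarray(shares) @ np.asarray(class_probs)

# Hypothetical choice situation: 3 alternatives, 2 attributes.
x = np.array([[1.0, 4.0],
              [2.0, 2.0],
              [3.0, 1.0]])
beta = np.array([-0.5, 0.3])             # hypothetical taste parameters

p_rum = rum_probs(x, beta)
p_prrm = prrm_probs(x, beta)
p_murrm = murrm_probs(x, beta, mu=1.0)

# Two-class model with a RUM class and a P-RRM class, hypothetical shares.
p_lc = latent_class_probs([p_rum, p_prrm], shares=[0.6, 0.4])
```

In an estimated LC model each class has its own taste parameters; a single beta is reused here only to keep the example short. For very small values of μ the μRRM probabilities approach those of the P-RRM rule.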

 

Since the shopping choice data do not contain explanatory variables that can be used to explain class membership, only class-specific constants are estimated (hence, the LC models are essentially discrete mixture models).
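
With constants only, the class shares are the same for every decision maker and the log-likelihood takes a simple form. The sketch below shows this under stated assumptions: the first class is taken as the reference class (constant fixed at zero), and `cond_lik` is a hypothetical array holding, for each decision maker and class, the likelihood of the observed choice(s) given that class.

```python
import numpy as np

def class_shares(delta_free):
    """Class shares from class-specific constants.

    delta_free holds the constants of classes 2..S; the constant of
    class 1 is fixed at zero for identification (an assumption here).
    """
    delta = np.concatenate(([0.0], np.asarray(delta_free, dtype=float)))
    e = np.exp(delta - delta.max())
    return e / e.sum()

def lc_log_likelihood(delta_free, cond_lik):
    """LL = sum_n ln( sum_s pi_s * L_ns ), with L_ns = cond_lik[n, s]."""
    pi = class_shares(delta_free)
    return float(np.log(np.asarray(cond_lik, dtype=float) @ pi).sum())

# Hypothetical class-conditional likelihoods: 4 decision makers, 2 classes.
cond_lik = np.array([[0.9, 0.1],
                     [0.8, 0.3],
                     [0.2, 0.7],
                     [0.1, 0.6]])
ll = lc_log_likelihood([0.0], cond_lik)   # equal shares of 0.5 each
```

In a full estimation the class-conditional likelihoods depend on the class-specific taste parameters, which are estimated jointly with the constants.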

 

Software code to estimate LC models is available for BIOGEME (PYTHON & PANDAS), Apollo R, LatentGOLD CHOICE and MATLAB. Because of the ease of interpretation, the MATLAB code uses Maximum Likelihood Estimation (MLE). However, MLE is relatively slow for Latent Class discrete choice models. Estimation code based on Expectation-Maximisation (EM) is distributed on request. Furthermore, note that the parameterisation of the μRRM model in LatentGOLD CHOICE differs somewhat from the parameterisation in Van Cranenburgh et al. (2015). Therefore, an accompanying document is provided showing how to compare the results of LatentGOLD CHOICE with those of e.g. MATLAB.
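
To give an idea of how Expectation-Maximisation works for LC models, the sketch below iterates the E-step and the class-share part of the M-step for a constants-only model. It is a simplified illustration, not the code distributed on request: the class-conditional likelihoods in `cond_lik` (a hypothetical input) are held fixed, whereas a full EM routine would also re-estimate the class-specific taste parameters in each M-step.

```python
import numpy as np

def em_class_shares(cond_lik, n_iter=200, tol=1e-10):
    """EM updates of the class shares with fixed class-conditional likelihoods.

    cond_lik : (N, S) likelihood of decision maker n's observed choice(s)
               given class s (held fixed in this sketch).
    Returns the estimated shares and the log-likelihood at each iteration.
    """
    cond_lik = np.asarray(cond_lik, dtype=float)
    n_classes = cond_lik.shape[1]
    pi = np.full(n_classes, 1.0 / n_classes)     # start from equal shares
    ll_trace = []
    for _ in range(n_iter):
        joint = cond_lik * pi                    # pi_s * L_ns
        marginal = joint.sum(axis=1, keepdims=True)
        ll_trace.append(float(np.log(marginal).sum()))
        h = joint / marginal                     # E-step: posterior memberships
        new_pi = h.mean(axis=0)                  # M-step: update the shares
        converged = np.max(np.abs(new_pi - pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi, ll_trace

# Hypothetical class-conditional likelihoods: 4 decision makers, 2 classes.
cond_lik = np.array([[0.9, 0.1],
                     [0.8, 0.3],
                     [0.2, 0.7],
                     [0.1, 0.6]])
shares, ll_trace = em_class_shares(cond_lik)
```

Each iteration only requires weighted averages of the posterior membership probabilities, and the log-likelihood does not decrease from one iteration to the next.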

 

MATLAB

  • Click here for a bundle of MATLAB codes to estimate Latent class models.

 

PYTHON BIOGEME

  • Click here for PYTHON BIOGEME code to estimate a two-class model comprising a RUM class and a P-RRM class.

  • Click here for PYTHON BIOGEME code to estimate a two-class model comprising two μRRM classes.

  • Click here for PYTHON BIOGEME code to estimate a three-class model comprising a RUM class, a P-RRM class and a μRRM class.

PANDAS BIOGEME

  • Click here for PANDAS BIOGEME code to estimate a two-class model comprising a RUM class and a P-RRM class.

  • Click here for PANDAS BIOGEME code to estimate a two-class model comprising two μRRM classes.

  • Click here for PANDAS BIOGEME code to estimate a three-class model comprising a RUM class, a P-RRM class and a μRRM class.

Apollo R

  • Click here for Apollo R code to estimate a two-class model comprising a RUM class and a P-RRM class.

  • Click here for Apollo R code to estimate a two-class model comprising two μRRM classes.

  • Click here for Apollo R code to estimate a three-class model comprising a RUM class, a P-RRM class and a μRRM class.

LatentGOLD

  • Click here for a bundle of LatentGOLD codes to estimate Latent class RRM models.

  • Click here to download a tutorial on estimating RRM models using LatentGOLD.

 

EXAMPLE DATA FILE

  • Click here to download the example shopping choice data file.

 

 

ESTIMATION RESULTS

The table below shows the estimation results for standard 'single class' and Latent Class models for the example shopping choice data. Based on the table, a number of observations can be made:

1) The results show that accommodating decision rule heterogeneity substantially improves model fit.

2) The 3-class model with one RUM class, one P-RRM class and one μRRM class statistically performs best.

3) The LC models accommodate both taste heterogeneity and decision rule heterogeneity. Looking for instance at the LC model with 3 μRRM classes, we see that the scale parameters μ of classes 2 and 3 are roughly the same. This suggests that the implied decision rules are by and large the same across these two classes. The considerable differences between the parameter estimates, however, clearly signal the presence of taste heterogeneity. In class 2, the negative taste parameter B_FSG indicates that members of this class assign a negative value to an increase in floor space for groceries. In contrast, the positive taste parameter B_FSG of class 3 indicates that members of this class perceive an increase in floor space for groceries as positive.

4) All identified classes attain membership probabilities higher than 0.30.

© 2015 Sander van Cranenburgh