MaxDiff
MaxDiff is a special case of the more general Best-Worst Scaling (BWS) technique, a discrete choice model first described by Jordan Louviere in 1987 while he was on the faculty at the University of Alberta. Louviere attributes the idea to the early work of Anthony A. J. Marley, who in his PhD thesis and together with Duncan Luce in the 1960s produced much of the ground-breaking research in mathematical psychology and psychophysics to axiomatise utility theory. The first working papers and publications appeared in the early 1990s. With BWS, survey respondents are shown a set of the possible items and are asked to indicate the best and worst items (or most and least important, most and least appealing, etc.).
A textbook describing the theory, methods and applications of BWS was published by Cambridge University Press in 2015, written by Jordan Louviere (University of South Australia), Terry N. Flynn (TF Choices Ltd) and Anthony A. J. Marley (University of Victoria and University of South Australia).[1] The book brings together the disparate research from various academic and practical disciplines so that needless replication and mistakes in implementation can be avoided. The three authors had previously published the key peer-reviewed articles describing BWS theory,[2][3][4] practice,[5][6] and a number of applications in health,[7] social care,[8] marketing, transport, voting,[9] and environmental economics.[10]
The book distinguishes two purposes of BWS: as a method of data collection, and as a theory of how people make choices when confronted with three or more items. This distinction is crucial, given the continuing misuse of the term maxdiff to describe the method. As Marley and Louviere describe, maxdiff is a long-established academic mathematical theory with very specific assumptions about how people make choices:[11] it assumes that respondents evaluate all possible pairs of items within the displayed set and choose the pair that reflects the maximum difference in preference or importance.
BWS may be thought of as a variation of the method of paired comparisons. Consider a set in which a respondent evaluates four items: A, B, C and D. If the respondent says that A is best and D is worst, these two responses inform us about five of the six possible implied paired comparisons:
- A > B, A > C, A > D, B > D, C > D
The only paired comparison that cannot be inferred is B vs. C. In a choice among five items, MaxDiff questioning informs on seven of the ten implied paired comparisons; in general, one best-worst answer among n items determines 2n − 3 of the n(n − 1)/2 pairs.
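The counting argument above can be checked with a short sketch. The function names `implied_comparisons` and `coverage` are illustrative assumptions, not part of any BWS software; the code simply enumerates the pairs that a single best-worst answer settles.

```python
from itertools import combinations


def implied_comparisons(items, best, worst):
    """Return pairs (x, y), meaning x > y, implied by one best-worst answer."""
    others = [i for i in items if i not in (best, worst)]
    implied = [(best, o) for o in others]    # best beats every other item
    implied.append((best, worst))            # best beats worst
    implied += [(o, worst) for o in others]  # every other item beats worst
    return implied


def coverage(n):
    """Return (implied pairs, total pairs) for a choice set of n items."""
    items = list(range(n))
    implied = implied_comparisons(items, best=items[0], worst=items[-1])
    total = len(list(combinations(items, 2)))
    return len(implied), total


pairs = implied_comparisons(["A", "B", "C", "D"], best="A", worst="D")
```

For the four-item example, `pairs` contains the five comparisons listed above, and B vs. C is the only pair missing.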
Yet respondents can produce best-worst data in any of a number of ways. Instead of evaluating all possible pairs (the maxdiff model), they might choose the best from n items and then the worst from the remaining n − 1, or vice versa, or they may use another method entirely. Thus maxdiff is a subset of BWS. Moreover, the number of possible pairs grows quadratically with the number of items: n items produce n(n − 1) pairs (where best-worst order matters). Assuming that respondents evaluate all possible pairs is therefore a strong assumption, and in 14 years of presentations the three co-authors have virtually never found a course or conference participant who admitted to using this method to elicit their best and worst choices. Virtually all use sequential models (best then worst, or worst then best).[12]
Early work did use the term maxdiff to refer to BWS, but with the recruitment of Marley to the team developing the method, the correct academic terminology has been disseminated throughout Europe and Asia-Pacific (if not North America, which continues to use the term maxdiff). Indeed, it is far from clear that the major vendors of discrete choice software actually implement maxdiff models when estimating parameters, despite their continued advertising of maxdiff capabilities. The renaming of the method, to make clear that maxdiff scaling is BWS but BWS is not necessarily maxdiff, was decided by Louviere in consultation with his two key contributors (Flynn and Marley) in preparation for the book, and was presented in an article by Flynn.[13]
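The difference between the maxdiff model and a sequential best-then-worst model can be made concrete with a minimal sketch. It assumes logit-style choice probabilities, in which the maxdiff probability of a (best, worst) pair is proportional to exp(u_best − u_worst), and the sequential model picks the best by a logit over utilities and then the worst by a logit over negated utilities of the remaining items. The utility values and function names are hypothetical.

```python
import math
from itertools import permutations


def maxdiff_probs(utils):
    """P(best, worst) under the maxdiff model: proportional to exp(u_b - u_w)."""
    pairs = list(permutations(utils, 2))  # all n(n-1) ordered (best, worst) pairs
    weights = {(b, w): math.exp(utils[b] - utils[w]) for b, w in pairs}
    total = sum(weights.values())
    return {pair: wt / total for pair, wt in weights.items()}


def best_then_worst_probs(utils):
    """P(best, worst) when best is chosen first, then worst from the rest."""
    denom_best = sum(math.exp(u) for u in utils.values())
    probs = {}
    for b in utils:
        p_best = math.exp(utils[b]) / denom_best
        rest = [k for k in utils if k != b]
        denom_worst = sum(math.exp(-utils[k]) for k in rest)
        for w in rest:
            probs[(b, w)] = p_best * math.exp(-utils[w]) / denom_worst
    return probs


utils = {"A": 1.0, "B": 0.5, "C": 0.0, "D": -1.0}  # hypothetical utilities
md = maxdiff_probs(utils)
seq = best_then_worst_probs(utils)
```

Both dictionaries assign a probability to each of the 4 × 3 = 12 ordered pairs, and both rank (A, D) as the most likely answer, but the probabilities differ. This is why the choice of model matters at the estimation stage.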
Process
The basic steps are:
- Conduct qualitative or other research to properly identify and describe all items of interest.[14]
- Construct a statistical design that indicates which items are to be presented in each set of items ("choice set"). Designs may come from publicly available catalogues, be constructed by hand, or be produced by commercially available software.
- Use the design to construct the choice sets, which contain the actual relevant items (textually or visually).
- Obtain response data in which respondents choose the best and worst from each task; repeated best-worst questioning (to obtain second best, second worst, etc.) may be conducted if the analyst wishes for more data.
- Input the data into a statistical software program and analyse. The software will produce utility functions for each of the items. In addition to utility scores, the analyst can also request raw counts, which simply total the number of times each item was selected as best and as worst. The utility functions indicate the perceived value of each item at the individual level and how sensitive consumer perceptions and preferences are to changes in product features.
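The raw counts mentioned in the last step can be tallied with a few lines of code. The following is a minimal sketch: the response format (a tuple of shown items, best pick and worst pick per choice set), the function name and the example answers are all assumptions made for illustration.

```python
from collections import Counter


def best_worst_counts(responses):
    """Tally best and worst picks and return best-minus-worst scores per item.

    `responses` is a list of (shown_items, best, worst) tuples, one per choice set.
    """
    best, worst, shown = Counter(), Counter(), Counter()
    for items, b, w in responses:
        shown.update(items)
        best[b] += 1
        worst[w] += 1
    return {
        item: {
            "best": best[item],
            "worst": worst[item],
            "score": best[item] - worst[item],
            # Normalise by appearances so items shown more often stay comparable.
            "score_per_showing": (best[item] - worst[item]) / shown[item],
        }
        for item in shown
    }


responses = [  # hypothetical answers from one respondent
    (("A", "B", "C", "D"), "A", "D"),
    (("A", "B", "C", "E"), "A", "C"),
    (("B", "C", "D", "E"), "B", "D"),
]
counts = best_worst_counts(responses)
```

Dividing the score by the number of times an item was shown is one common way to keep counts comparable when the design does not show every item equally often.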
Analysis
Estimation of the utility function is performed using any of a variety of methods:
- Multinomial discrete choice analysis, in particular multinomial logit (strictly speaking the conditional logit, although the two terms are now used interchangeably). The multinomial logit (MNL) model is often the first stage in analysis and provides a measure of average utility for the attribute levels or objects (depending on the Case).
- In many cases, particularly Cases 1 and 2, simple observation and plotting of choice frequencies should actually be the first step, as it is very useful in identifying preference heterogeneity and respondents using decision rules based on a single attribute.
- Several algorithms can be used in this estimation process, including maximum likelihood, neural networks, and hierarchical Bayes (HB) models. HB is beneficial because it allows borrowing of information across respondents, although since BWS often allows the estimation of individual-level models, the benefits of Bayesian models are heavily attenuated. Response time models have recently been shown to replicate the utility estimates of BWS, which represents a major step forward in the validation of stated preferences generally, and BWS preferences specifically.[15][16]
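As an illustration of maximum likelihood estimation, the sketch below fits item utilities under the maxdiff logit model by plain gradient ascent. It is a toy example under stated assumptions, not how any particular software package works: the function name `fit_maxdiff`, the step size, the iteration count and the response data are all hypothetical, and utilities are centred to sum to zero for identification.

```python
import math
from itertools import permutations


def fit_maxdiff(responses, items, steps=500, lr=0.1):
    """Estimate utilities by gradient ascent on the maxdiff log-likelihood."""
    u = {i: 0.0 for i in items}
    for _ in range(steps):
        grad = {i: 0.0 for i in items}
        for shown, best, worst in responses:
            pairs = list(permutations(shown, 2))  # candidate (best, worst) pairs
            weights = [math.exp(u[b] - u[w]) for b, w in pairs]
            z = sum(weights)
            # Observed part: +1 for the chosen best, -1 for the chosen worst.
            grad[best] += 1.0
            grad[worst] -= 1.0
            # Expected part under the current utilities.
            for (b, w), wt in zip(pairs, weights):
                grad[b] -= wt / z
                grad[w] += wt / z
        for i in items:
            u[i] += lr * grad[i] / len(responses)
        mean = sum(u.values()) / len(u)
        u = {i: v - mean for i, v in u.items()}  # centre for identification
    return u


shown = ("A", "B", "C", "D")
responses = [  # hypothetical (shown, best, worst) answers
    (shown, "A", "D"), (shown, "A", "D"), (shown, "B", "D"),
    (shown, "A", "C"), (shown, "B", "C"), (shown, "C", "D"),
]
u_hat = fit_maxdiff(responses, shown)
```

In this example the fitted utilities follow the same order as the best-minus-worst counts (A, B, C, D). Production analyses would normally use an established optimiser or statistical package and report standard errors.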
External sources
- Almquist, Eric; Lee, Jason (April 2009), What Do Customers Really Want?, Harvard Business Review, retrieved 15 February 2010
- Cohen, Steve and Paul Markowitz (2002), “Renewing Market Segmentation: Some New Tools to Correct Old Problems,” ESOMAR 2002 Congress Proceedings, 595-612, ESOMAR: Amsterdam, The Netherlands.
- Cohen, Steven H. (April 2003). "Maximum Difference Scaling: Improved Measures of Importance and Preference for Segmentation". Proceedings of the Sawtooth Software Conference. San Antonio, TX. pp. 61–74.
- Louviere, J. J. (1991), “Best-Worst Scaling: A Model for the Largest Difference Judgments,” Working Paper, University of Alberta.
- Louviere, J.J.; Flynn, T.N.; Marley, A.A.J., “Best-Worst Scaling: Theory, Methods and Applications”, Cambridge University Press, Cambridge (September 2015)
- Thurstone, L. L. (1927), “A Law of Comparative Judgment,” Psychological Review, 34, 273-286.
References
- ^ "Best-Worst Scaling". Cambridge University Press. Retrieved 30 September 2015.
- ^ Marley, Anthony AJ; Louviere, Jordan J. (1 January 2005). "Some probabilistic models of best, worst, and best–worst choices". Journal of Mathematical Psychology. 49 (6): 464–480.
- ^ Marley, A. A. J.; Flynn, Terry N.; Louviere, J. J. (1 January 2008). "Probabilistic models of set-dependent and attribute-level best–worst choice". Journal of Mathematical Psychology. 52 (5): 281–296.
- ^ Marley, A. A. J.; Pihlens, D. (1 January 2012). "Models of best–worst choice and ranking among multiattribute options (profiles)". Journal of Mathematical Psychology. 56 (1): 24–34.
- ^ Flynn, Terry N.; Louviere, Jordan J.; Peters, Tim J.; Coast, Joanna (1 January 2007). "Best-worst scaling: What it can do for health care research and how to do it". Journal of Health Economics. 26 (1): 171–189.
- ^ Louviere, Jordan; Lings, Ian; Islam, Towhidul; Gudergan, Siegfried; Flynn, Terry (1 January 2013). "An introduction to the application of (case 1) best–worst scaling in marketing research". International Journal of Research in Marketing. 30 (3): 292–303.
- ^ Flynn, Terry N.; Louviere, Jordan J.; Peters, Tim J.; Coast, Joanna (1 January 2007). "Best-worst scaling: What it can do for health care research and how to do it". Journal of Health Economics. 26 (1): 171–189.
- ^ Potoglou, Dimitris; Burge, Peter; Flynn, Terry; Netten, Ann; Malley, Juliette; Forder, Julien; Brazier, John E. (1 January 2011). "Best–worst scaling vs. discrete choice experiments: An empirical comparison using social care data". Social Science & Medicine. 72 (10): 1717–1727.
- ^ García-Lapresta, José Luis; Marley, Anthony AJ; Martínez-Panero, Miguel (1 January 2010). "Characterizing best–worst voting systems in the scoring context". Social Choice and Welfare. 34 (3): 487–496.
- ^ Scarpa, Riccardo; Notaro, Sandra; Louviere, Jordan; Raffaelli, Roberta (19 June 2011). "Exploring Scale Effects of Best/Worst Rank Ordered Choice Data to Estimate Benefits of Tourism in Alpine Grazing Commons". American Journal of Agricultural Economics: aaq174. doi:10.1093/ajae/aaq174. ISSN 0002-9092.
- ^ Marley, Anthony AJ; Louviere, Jordan J. (1 January 2005). "Some probabilistic models of best, worst, and best–worst choices". Journal of Mathematical Psychology. 49 (6): 464–480.
- ^ Flynn, Terry; Louviere, Jordan; Peters, Tim; Coast, Joanna (1 January 2008). "Estimating preferences for a dermatology consultation using Best-Worst Scaling: Comparison of various methods of analysis". BMC Medical Research Methodology. 8 (1): 76.
- ^ Flynn, Terry N. (1 January 2010). "Valuing citizen and patient preferences in health: recent developments in three types of best–worst scaling". Expert Review of Pharmacoeconomics & Outcomes Research. 10 (3): 259–267.
- ^ Coast, Joanna; Al-Janabi, Hareth; Sutton, Eileen J.; Horrocks, Susan A.; Vosper, A. Jane; Swancutt, Dawn R.; Flynn, Terry N. (1 January 2012). "Using qualitative methods for attribute development for discrete choice experiments: issues and recommendations". Health Economics. 21 (6): 730–741.
- ^ Hawkins, Guy E.; Marley, A. A. J.; Heathcote, Andrew; Flynn, Terry N.; Louviere, Jordan J.; Brown, Scott D. (1 January 2014). "Integrating cognitive process and descriptive models of attitudes and preferences". Cognitive Science. 38 (4): 701–735.
- ^ Hawkins, Guy E.; Marley, A. A. J.; Heathcote, Andrew; Flynn, Terry N.; Louviere, Jordan J.; Brown, Scott D. (1 January 2014). "The best of times and the worst of times are interchangeable". Decision. 1 (3): 192.