Biplot

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Skbkekas (talk | contribs) at 21:25, 28 February 2011 (Include mathematical details about the construction of biplots.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
Biplot of Anderson's iris data set
Spectramap biplot of Anderson's iris data set
Discriminant Analysis Biplot of Fisher's Iris Data (Greenacre, 2010)

Biplots are a type of exploratory graph used in statistics, a generalization of the simple two-variable scatterplot. A biplot allows information on both samples and variables of a data matrix to be displayed graphically. Samples are displayed as points, while variables are displayed as vectors, linear axes or nonlinear trajectories. For a categorical variable, category level points may be used to represent its levels. A generalised biplot displays information on both continuous and categorical variables.

Introduction and history

The biplot was introduced by Gabriel (1971). Gower and Hand (1996) wrote a monograph on biplots. Yan and Kang (2003) described various methods for visualizing and interpreting a biplot. The book by Greenacre (2010) is a practical, user-oriented guide to biplots. It includes scripts in the open-source R programming language for generating biplots associated with principal component analysis (PCA), multidimensional scaling (MDS), log-ratio analysis (LRA; also known as spectral mapping[1][2]), discriminant analysis (DA) and various forms of correspondence analysis: simple correspondence analysis (CA), multiple correspondence analysis (MCA) and canonical correspondence analysis (CCA). The book by Gower, Lubbe and le Roux (2011) aims to popularize biplots as a useful and reliable method for visualizing multivariate data when researchers want to consider, for example, principal component analysis (PCA), canonical variates analysis (CVA) or various types of correspondence analysis.

Construction

A biplot is constructed by using the singular value decomposition (SVD) to obtain a low-rank approximation to a centered version of the data matrix $X$, whose $n$ rows are the samples (also called the cases, or objects), and whose $p$ columns are the variables. The centered data matrix $X_c$ is obtained from $X$ by centering the columns (the variables), that is, by subtracting each column's mean from its entries. Using the SVD, we can write $X_c = \sum_{j=1}^{p} d_j U_j V_j'$, where the $U_j$ are $n$-dimensional column vectors, the $V_j$ are $p$-dimensional column vectors, and the $d_j$ are a non-increasing sequence of non-negative scalars. The biplot is formed from two scatterplots that share a common set of axes. The first scatterplot is formed from the points $(d_1^{1/2} U_{1j},\ d_2^{1/2} U_{2j})$, for $j = 1, \ldots, n$, where $U_{kj}$ denotes the $j$-th entry of $U_k$. The second plot is formed from the points $(d_1^{1/2} V_{1j},\ d_2^{1/2} V_{2j})$, for $j = 1, \ldots, p$, with $V_{kj}$ the $j$-th entry of $V_k$, and is usually drawn as line segments from the origin to these points. This is the biplot formed by the dominant two terms of the SVD. Additional biplots can be constructed by pairing other terms. Note also that the scaling of the points by $d_j^{1/2}$ is sometimes replaced by other scalings.
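
The construction above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions, not a reference implementation: the function name `biplot_coordinates` and its `dims` parameter are hypothetical, and the symmetric square-root scaling is the one described in the text rather than one of the alternative scalings.

```python
import numpy as np


def biplot_coordinates(X, dims=(0, 1)):
    """Return sample and variable coordinates for a biplot.

    Hypothetical helper for illustration. `dims` selects which pair of
    SVD terms to plot (zero-based); (0, 1) gives the dominant two terms.
    """
    X = np.asarray(X, dtype=float)

    # Center each column (variable) by subtracting its mean.
    Xc = X - X.mean(axis=0)

    # Thin SVD: Xc = U @ diag(d) @ Vt, with d sorted in non-increasing order.
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)

    k = list(dims)
    scale = np.sqrt(d[k])  # the d_j^{1/2} scaling applied to both sets of points

    # Row j of sample_points is (d_1^{1/2} U_1j, d_2^{1/2} U_2j).
    sample_points = U[:, k] * scale

    # Row j of variable_points is (d_1^{1/2} V_1j, d_2^{1/2} V_2j).
    variable_points = Vt[k, :].T * scale

    return sample_points, variable_points, d
```

With this scaling, the product `sample_points @ variable_points.T` equals the sum of the two selected SVD terms, so inner products between sample points and variable vectors approximate the entries of the centered data matrix. To draw the biplot, the rows of `sample_points` would be plotted as points and the rows of `variable_points` as line segments from the origin, on the same axes.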

References

  1. ^ Paul J. Lewi, "Spectral mapping, a personal and historical account of an adventure in multivariate data analysis." Chemometrics and Intelligent Laboratory Systems, 77, 1-2, 2005, 215-223.
  2. ^ David Livingstone (2009). A Practical Guide to Scientific Data Analysis. Chichester, John Wiley & Sons Ltd, 233-238. ISBN 9780470851531

Sources

  • Gabriel, K.R. (1971). "The biplot graphic display of matrices with application to principal component analysis". Biometrika. 58 (3): 453–467. doi:10.1093/biomet/58.3.453.
  • Gower, J.C., Lubbe, S. and le Roux, N. (due January 2011). Understanding Biplots. Wiley. ISBN 978-0-470-01255-0
  • Gower, J.C. and Hand, D.J. (1996). Biplots. Chapman & Hall, London, UK. ISBN 0412716305
  • Greenacre, M. (2010). Biplots in Practice. BBVA Foundation, Madrid, Spain. Available for free download, with materials. ISBN 978-84-923846-8-6
  • Yan, W. and Kang, M.S. (2003). GGE Biplot Analysis. CRC Press, Boca Raton, FL. ISBN 0849313384
  • Vicente-Villardón, J.L., Galindo-Villardón, M.P. and Blázquez-Zaballos, A. (2006). Logistic Biplots. In: Multiple Correspondence Analysis and Related Methods. Greenacre, M. and Blasius, J. (Eds) Chapman & Hall/CRC Press. Boca Raton. USA. ISBN 1584886285
  • Demey, J.R., Vicente-Villardón, J.L., Galindo-Villardón, M.P. and Zambrano, A.Y. (2008). Identifying molecular markers associated with classification of genotypes by External Logistic Biplots. Bioinformatics. 24(24):2832-2838