
Functional boxplot

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Sunwards (talk | contribs) at 21:13, 3 September 2011. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


The functional boxplot is an exploratory tool for visualizing functional data, proposed by Sun and Genton (2011) [1]. Analogous to the classical boxplot, its descriptive statistics are the envelope of the 50% central region, the median curve, and the maximum non-outlying envelope.

The first step in constructing a functional boxplot is ordering the data. In functional data analysis each observation is a real function, so the data cannot simply be sorted from the smallest sample value to the largest, as in the classical boxplot. Instead, functional data (e.g. curves or images) are ordered by band depth or modified band depth, notions proposed by López-Pintado and Romo (2009) [2]. Band depth orders functional data from the center outwards and thus provides a measure for defining functional quantiles and the centrality or outlyingness of an observation. Once the functional data are ranked, the functional boxplot is a natural extension of the classical boxplot.
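The ordering step can be illustrated with a minimal sketch of the modified band depth, assuming bands formed by pairs of curves and curves sampled on a common grid. The function name `modified_band_depth` and the brute-force loop over all pairs are choices made for this illustration only; they are not taken from the cited papers' software.

```python
from itertools import combinations

import numpy as np


def modified_band_depth(curves):
    """Modified band depth using bands delimited by pairs of curves.

    curves: array of shape (n_curves, n_points), all sampled on the same grid.
    Returns one depth value per curve; larger means more central.
    """
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[0]
    depth = np.zeros(n)
    pairs = list(combinations(range(n), 2))
    for j, k in pairs:
        # Pointwise band delimited by curves j and k.
        lower = np.minimum(curves[j], curves[k])
        upper = np.maximum(curves[j], curves[k])
        # Proportion of grid points at which each curve lies inside the band.
        inside = (curves >= lower) & (curves <= upper)
        depth += inside.mean(axis=1)
    # Average over all pairs of curves.
    return depth / len(pairs)
```

Sorting the curves by decreasing depth then gives the center-outward ordering; the curve with the largest depth is the most central one. The loop visits every pair of curves, so this sketch is only practical for small samples.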


Construction of Functional Boxplots

In the classical boxplot, the box represents the middle 50% of the data. Because the functional boxplot orders data from the center outwards, the 50% central region is defined as the band delimited by the deepest 50% of the observations, that is, the most central ones. The border of this region is the envelope that plays the role of the box in a classical boxplot. The 50% central region is therefore the analog of the "interquartile range" (IQR) and gives a useful indication of the spread of the central 50% of the curves. It is a robust range for interpretation because it is not affected by outliers or extreme values, and it gives a less biased visualization of the curves' spread. The curve drawn inside the box is the median, i.e. the most central observation, which is also a robust statistic for measuring centrality.
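As a rough sketch of this construction, assume a depth value is already available for each curve and the curves share a common grid. The function name `central_region`, its `fraction` parameter, and the rounding of the number of retained curves with a ceiling are assumptions made for this illustration.

```python
import numpy as np


def central_region(curves, depth, fraction=0.5):
    """Envelope of the deepest `fraction` of the curves, plus the median curve.

    curves: array of shape (n_curves, n_points) on a common grid.
    depth:  one depth value per curve; larger means more central.
    """
    curves = np.asarray(curves, dtype=float)
    depth = np.asarray(depth, dtype=float)
    # Indices of the curves from the deepest to the most outlying.
    order = np.argsort(-depth, kind="stable")
    # Assumption: keep ceil(fraction * n) curves.
    n_keep = int(np.ceil(fraction * len(curves)))
    deepest = curves[order[:n_keep]]
    lower = deepest.min(axis=0)  # lower border of the central region
    upper = deepest.max(axis=0)  # upper border of the central region
    median = curves[order[0]]    # deepest curve
    return lower, upper, median
```

With `fraction=0.5` the returned `lower` and `upper` borders correspond to the box, and `median` is the curve drawn inside it.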

The "whiskers" of the boxplot are the vertical lines extending from the box to the maximum non-outlying envelope, that is, the envelope of the dataset excluding the outliers.

Outlier Detection

Outliers can be detected in a functional boxplot by an empirical rule based on 1.5 times the 50% central region, analogous to the 1.5 × IQR rule for classical boxplots. The fences are obtained by inflating the envelope of the 50% central region by 1.5 times the height of that region. Any observation outside the fences is flagged as a potential outlier. When each observation is simply a point, the functional boxplot reduces to a classical boxplot. A functional boxplot is not the same as a collection of pointwise boxplots.
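The fence rule can be sketched as follows, assuming the lower and upper borders of the 50% central region are already known on the same grid as the curves. The function names, the `factor` parameter, and the convention that a curve is flagged as soon as it leaves the fences at any grid point are assumptions made for this illustration.

```python
import numpy as np


def flag_outliers(curves, lower, upper, factor=1.5):
    """Flag curves that leave the fences around the 50% central region.

    lower, upper: borders of the 50% central region, shape (n_points,).
    Returns a boolean flag per curve and the two fences.
    """
    curves = np.asarray(curves, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    height = upper - lower                 # pointwise height of the region
    lower_fence = lower - factor * height  # inflate the envelope downwards
    upper_fence = upper + factor * height  # inflate the envelope upwards
    # Assumption: one excursion outside the fences is enough to flag a curve.
    outside = (curves < lower_fence) | (curves > upper_fence)
    return outside.any(axis=1), lower_fence, upper_fence


def max_non_outlying_envelope(curves, flags):
    """Pointwise minimum and maximum over the curves that were not flagged."""
    kept = np.asarray(curves, dtype=float)[~np.asarray(flags, dtype=bool)]
    return kept.min(axis=0), kept.max(axis=0)
```

The envelope returned by `max_non_outlying_envelope` is where the whiskers end; flagged curves would be drawn separately as potential outliers.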

Enhanced Functional Boxplots

By introducing the concept of central regions, the functional boxplot can be generalized to an enhanced functional boxplot where the 25% and 75% central regions are provided as well.

Examples

Software

The fbplot command for functional boxplots is available in the R package fda, and MATLAB code is also available.

See Also

  • Boxplot
  • Adjusted Functional Boxplots

References

  1. ^ Sun, Y., and Genton, M. G. (2011), "Functional boxplots," Journal of Computational and Graphical Statistics, 20, 316-334.
  2. ^ López-Pintado, S. and Romo, J. (2009), "On the concept of depth for functional data," Journal of the American Statistical Association, 104, 718-734.