Interactive visual analysis

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Geirsmestad (talk | contribs) at 14:03, 8 April 2013 (Applications). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Interactive Visual Analysis (IVA) is a set of techniques for combining the computational power of computers with the perceptive and cognitive capabilities of humans, in order to extract knowledge from large and complex datasets. The techniques rely heavily on user interaction and the human visual system, and hence lie at the intersection of visual analytics and big data. IVA is a branch of data visualization. It is suited to analyzing high-dimensional data with a large number of data points, where simple graphing and non-interactive techniques give an insufficient understanding of the information. [1]

These techniques involve looking at datasets through different, correlated views and iteratively selecting and examining features the user finds interesting. The objective of IVA is to gain knowledge which is not readily apparent from a dataset, typically in tabular form. This can involve generating, testing or verifying hypotheses, or simply exploring the dataset to look for correlations between different variables.

History

Focus + Context visualization and its related techniques date back to the 1970s. [2] Early attempts at combining these techniques for Interactive Visual Analysis occur in the WEAVE visualization system for cardiac simulation [3] in 2000. SimVis appeared in 2003 [4], and multiple PhD projects have explored the concept since then, notably those of Helmut Doleisch in 2004 [5], Johannes Kehrer in 2011 [6] and Zoltan Konyha in 2013 [7]. ComVis, which is used in the visualization community, appeared in 2008. [8]

Basics

The objective of Interactive Visual Analysis is to discover information in data which is not readily apparent. We want to move from the data itself to the information contained in the data, ultimately uncovering knowledge which was not apparent from looking at the raw numbers.

The most basic form of IVA is to use coordinated multiple views [9] displaying different columns of our dataset. We typically want at least two views, but possibly more. These views are usually among the common tools of information visualization, such as histograms, scatterplots or parallel coordinates, but using volume rendered views is also possible if this is appropriate for the data. [6] Typically, we want one view to display the independent variables of the dataset (e.g. time or spatial location), while the others display the dependent variables (e.g. temperature, pressure or population density) in relation to each other. If the views are linked, we can select data points in one view and have the corresponding data points get automatically highlighted in the other views. This technique, which intuitively lets us explore higher-dimensional properties of the data, is known as linking and brushing. [10] [11]
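Linking and brushing can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any particular IVA system: the table, its column names and the helper functions `brush` and `linked_view` are all hypothetical. The key idea it shows is that a brush yields a set of row indices, and every linked view consults that same set when deciding which points to highlight.

```python
# Hypothetical toy table: each row is one data point (names are illustrative).
rows = [
    {"time": 0, "temperature": 12.0, "pressure": 1.01},
    {"time": 1, "temperature": 18.5, "pressure": 1.00},
    {"time": 2, "temperature": 25.0, "pressure": 0.98},
    {"time": 3, "temperature": 31.0, "pressure": 0.97},
]


def brush(rows, column, low, high):
    """Return the set of row indices whose value in `column` lies in [low, high]."""
    return {i for i, row in enumerate(rows) if low <= row[column] <= high}


def linked_view(rows, x, y, selected):
    """Project the rows onto (x, y) and flag each point as highlighted or not."""
    return [(row[x], row[y], i in selected) for i, row in enumerate(rows)]


# Brush high temperatures in one view ...
selected = brush(rows, "temperature", 20.0, 40.0)
# ... and the same rows are flagged in a second view of time versus pressure.
time_pressure = linked_view(rows, "time", "pressure", selected)
```

Because the selection is stored as row indices rather than as screen coordinates, any number of views over any columns can share it.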

The selection we make in one view does not have to be binary. We can allow for a gradual “degree of interest” [12] [6] [5] in our selection, where data points are highlighted more strongly as we move from low to high interest. This gives our search for information an inherent “focus+context” [13] aspect.
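A gradual degree of interest can be expressed as a function that maps each value to a number between 0 and 1. The sketch below assumes a linear fall-off outside the fully selected range; the function name `smooth_brush`, its parameters and the linear shape are illustrative assumptions, and the cited work may use a different profile.

```python
def smooth_brush(value, inner_low, inner_high, margin):
    """Degree of interest in [0, 1].

    Values inside [inner_low, inner_high] get full interest (1.0). Outside
    that range the interest falls off linearly and reaches 0.0 at a distance
    of `margin` from the nearest border (an assumed fall-off shape).
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if inner_low <= value <= inner_high:
        return 1.0
    if value < inner_low:
        distance = inner_low - value
    else:
        distance = value - inner_high
    return max(0.0, 1.0 - distance / margin)


# Full interest for 20..30 degrees, fading out over a further 10 degrees.
interest = [smooth_brush(t, 20.0, 30.0, 10.0) for t in (15.0, 25.0, 35.0, 45.0)]
```

A view can then map the degree of interest to opacity or color saturation, so that context points remain visible but subdued.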

The IVA loop

Interactive Visual Analysis is an iterative process. Discoveries made by brushing the data and looking at the linked views can be used as a starting point for repeating the process, leading to a form of information drill-down. As an example, consider the analysis of data from a simulation of a combustion engine. The user brushes a histogram of temperature distribution, and discovers that one specific part of one cylinder has dangerously high temperatures. This information can be used to formulate the hypothesis that all cylinders have a problem with heat dissipation. This could be verified by brushing the same region in all other cylinders and seeing in the temperature histogram that these cylinders also have higher temperatures than expected. [14]
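The two iterations of the engine example can be mimicked on a toy dataset. Everything in this sketch is hypothetical: the cell records, the region labels and the `HOT` threshold are invented for illustration, and a real analysis would be carried out interactively rather than in a script.

```python
# Hypothetical simulation cells: (cylinder, region, temperature in kelvin).
cells = [
    (1, "intake", 600.0), (1, "exhaust", 950.0),
    (2, "intake", 610.0), (2, "exhaust", 940.0),
    (3, "intake", 590.0), (3, "exhaust", 960.0),
]
HOT = 900.0  # assumed threshold for "dangerously high"

# Iteration 1: brush hot temperatures in cylinder 1 and note where they occur.
hot_regions = {region for cyl, region, temp in cells if cyl == 1 and temp >= HOT}

# Iteration 2: brush those regions in the other cylinders to test the hypothesis
# that they run hot as well.
other = [temp for cyl, region, temp in cells if cyl != 1 and region in hot_regions]
hypothesis_supported = bool(other) and all(temp >= HOT for temp in other)
```

The result of the first brush (a region) becomes the input of the second brush, which is the drill-down pattern described above.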

Data model

The data source for IVA is usually tabular data, represented in columns and rows. The data variables can be divided into two categories: independent and dependent variables. The independent variables represent the domain of the observed values, such as time and space. The dependent variables represent the data being observed, for instance temperature, pressure or height. [14]
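As a concrete illustration of this data model, the sketch below stores one row per observation and keeps track of which columns are independent and which are dependent. The class `Sample`, its field names and the values are assumptions made for the example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # Independent variables: the domain in which the observation was made.
    time: float
    x: float
    y: float
    # Dependent variables: the values observed at that point of the domain.
    temperature: float
    pressure: float


INDEPENDENT = ("time", "x", "y")
DEPENDENT = ("temperature", "pressure")

# Two hypothetical rows of the table.
table = [
    Sample(time=0.0, x=0.0, y=0.0, temperature=15.2, pressure=1.01),
    Sample(time=0.0, x=1.0, y=0.0, temperature=16.0, pressure=1.00),
]


def column(table, name):
    """Extract one column of the table as a list of values."""
    return [getattr(row, name) for row in table]
```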

IVA can help the user uncover information and knowledge in datasets that have few dimensions as well as in datasets that have a very large number of dimensions. [15]

Levels of IVA

The IVA tools can be divided into several levels of complexity. These levels provide the user with different interactive tools to analyze the data. For most uses the lowest level is more than sufficient; it is also the level that gives the user the fastest response to interaction. The higher levels let the user uncover more hidden information. However, they require more knowledge about the tools, and the interaction has a longer response time. [1]

Base level

The simplest level is the base level, which consists of brushing and linking. Here the user can set up several graphs with different variables and mark an interesting area in one of the graphs, for example with color. The corresponding points in the other graphs are then marked automatically with the same color. A great deal of information can usually be derived from this level of IVA. [7]

Second level

Brushing and linking with logical combination is the second level of IVA. Here the user can mark several areas in one or several graphs and combine these areas with the logical operators AND, OR and NOT. This makes it possible to go deeper into the dataset and see more hidden information. [7]
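If each brush is represented as a set of selected row indices, the logical combinations map directly onto set operations. The following sketch assumes that representation; the columns, thresholds and the helper `brush` are hypothetical.

```python
# Hypothetical columns of a small table, aligned by row index.
temperature = [12.0, 18.5, 25.0, 31.0, 28.0]
pressure = [1.01, 1.00, 0.98, 0.97, 1.02]


def brush(values, low, high):
    """Return the set of row indices whose value lies in [low, high]."""
    return {i for i, v in enumerate(values) if low <= v <= high}


everything = set(range(len(temperature)))
hot = brush(temperature, 24.0, 40.0)
low_pressure = brush(pressure, 0.0, 0.99)

hot_and_low = hot & low_pressure                 # AND: both brushes hit
hot_or_low = hot | low_pressure                  # OR: at least one brush hits
hot_not_low = hot & (everything - low_pressure)  # AND NOT: hot, but not low pressure
```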

Third level

Level three provides the user with two tools that make it possible to see more hidden information. The first tool is attribute derivation. Here the user can derive new attributes such as estimates of derivatives, clustering information and other statistical information. The derived attributes can then be linked and brushed like any other attribute. [7]
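As one example of attribute derivation, a rate of change can be estimated from a time series and added to the table as a new column. The sketch below uses simple forward differences, which is only one of many possible estimators; the function name, the data and the brushing threshold are assumptions.

```python
def finite_difference(times, values):
    """Estimate d(values)/d(times) with forward differences.

    The last row has no successor, so it reuses the previous slope.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two equally long sequences with at least 2 entries")
    slopes = [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    ]
    slopes.append(slopes[-1])
    return slopes


times = [0.0, 1.0, 2.0, 3.0]
temperature = [10.0, 12.0, 18.0, 19.0]

# The derived attribute behaves like any other column ...
heating_rate = finite_difference(times, temperature)
# ... so it can be brushed in the same way.
fast_heating = {i for i, rate in enumerate(heating_rate) if rate >= 5.0}
```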

The second tool in level three of IVA is advanced brushing, for example angular brushing, similarity brushing and percentile brushing. These methods generate a faster response than attribute derivation, but they are harder to use because they have a steeper learning curve. [7]
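Percentile brushing selects points by their rank within a column rather than by their absolute value. The sketch below is one possible reading of the idea: it assumes a percentile rank of 100 * rank / (n - 1) and breaks ties by row order, neither of which is taken from the cited sources.

```python
def percentile_brush(values, low_pct, high_pct):
    """Select indices whose percentile rank lies in [low_pct, high_pct].

    The percentile rank is assumed to be 100 * rank / (n - 1), where rank is
    the position of the value in ascending order (ties keep row order).
    """
    n = len(values)
    if n == 0:
        return set()
    order = sorted(range(n), key=lambda i: values[i])
    selected = set()
    for rank, i in enumerate(order):
        pct = 100.0 * rank / (n - 1) if n > 1 else 0.0
        if low_pct <= pct <= high_pct:
            selected.add(i)
    return selected


temperature = [31.0, 12.0, 25.0, 18.5, 28.0]
# Brush the upper quarter of the temperature distribution.
top_quarter = percentile_brush(temperature, 75.0, 100.0)
```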

Fourth level

This level is much more specific to each dataset and varies depending on the dataset and the purpose of the analysis. It provides the user with a deep drill-down into the dataset. [1]

Patterns of IVA

The "linking and brushing" (selection) concept of IVA can be used between different types of variables in the dataset. Which pattern we should use depends on which aspect of the correlations in the dataset is of interest. [1]

Feature localization

Brushing data points from the set of dependent variables (e.g. temperature) and seeing where among the independent variables (e.g. space or time) these data points show up is called "feature localization". With feature localization, the user can easily identify the location of features in the dataset. Examples from a meteorological dataset would be which regions have a warm climate or which times of the year have a lot of precipitation. [1]
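Feature localization can be sketched as a brush on a dependent column followed by a count over the independent columns. The meteorological records and the 15 degree threshold below are invented for illustration.

```python
from collections import Counter

# Hypothetical observations: (region, month, temperature in degrees Celsius).
observations = [
    ("north", "Jan", -5.0), ("north", "Jul", 14.0),
    ("south", "Jan", 16.0), ("south", "Jul", 29.0),
    ("coast", "Jan", 6.0), ("coast", "Jul", 21.0),
]

# Brush the dependent variable (warm temperatures) ...
warm = [(region, month) for region, month, temp in observations if temp >= 15.0]

# ... and see where in the independent variables the selection shows up.
warm_by_region = Counter(region for region, _ in warm)
warm_by_month = Counter(month for _, month in warm)
```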

Local investigation

If independent variables are brushed and we look for the corresponding connection to a dependent view, this is termed "local investigation". This makes it possible to investigate the characteristics of, for example, a specific region or a specific time. In the case of meteorological data, we could for instance discover the temperature distribution during the winter months. [1]
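Local investigation reverses the direction: the brush is placed on an independent column and the dependent values of the selection are summarized. In this sketch the observations, the set of winter months and the 5 degree bin width are assumptions.

```python
# Hypothetical observations: (month, temperature in degrees Celsius).
observations = [
    ("Dec", -3.0), ("Jan", -6.5), ("Feb", -1.0),
    ("Jun", 17.0), ("Jul", 21.5), ("Jan", 2.0),
]
WINTER = {"Dec", "Jan", "Feb"}


def histogram(values, bin_width):
    """Count values per bin; each bin is keyed by its lower edge."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


# Brush the independent variable (winter months) ...
winter_temps = [temp for month, temp in observations if month in WINTER]
# ... and inspect the distribution of the dependent variable.
winter_hist = histogram(winter_temps, 5.0)
```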

Multivariate analysis

Brushing dependent variables and watching the connection to other dependent variables is called multivariate analysis. This could for example be used to find out if high temperatures are correlated with pressure by brushing high temperatures and watching a linked view of pressure distributions.

Since each of the linked views usually has two or more dimensions, multivariate analysis can implicitly uncover higher-dimensional features of the data which would not be readily apparent from e.g. a simple scatterplot. [1]
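The temperature and pressure example can be sketched by brushing one dependent column and comparing the other column inside and outside the selection. The data and the threshold are hypothetical, and a difference in means only hints at a relationship; it is not a statistical test.

```python
from statistics import mean

# Hypothetical dependent columns, aligned by row index.
temperature = [12.0, 18.5, 25.0, 31.0, 28.0, 15.0]
pressure = [1.02, 1.01, 0.98, 0.96, 0.97, 1.02]

# Brush one dependent variable (high temperatures) ...
hot = [i for i, t in enumerate(temperature) if t >= 25.0]

# ... and compare the linked dependent variable inside the brush with the
# whole dataset.
pressure_hot = mean(pressure[i] for i in hot)
pressure_all = mean(pressure)
hot_has_lower_pressure = pressure_hot < pressure_all
```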

Applications

Concepts from Interactive Visual Analysis have been implemented in multiple software packages, for both research and commercial purposes.

ComVis is often used by visualization researchers in academia, while SimVis is optimized for analyzing simulation data. [16] [8] Tableau is another example of a commercial software product utilizing concepts from IVA.

References

  1. ^ a b c d e f g Interactive Visual Analysis of Scientific Data. Steffen Oeltze, Helmut Doleisch, Helwig Hauser, Gunther Weber. Presentation at IEEE VisWeek 2012, Seattle (WA), USA
  2. ^ Hauser, Helwig. "Generalizing focus+context visualization." Scientific visualization: The visual extraction of knowledge from data. Springer Berlin Heidelberg, 2006. 305-327.
  3. ^ Gresh, Donna L., et al. "WEAVE: A system for visually linking 3-D and statistical visualizations, applied to cardiac simulation and measurement data." Proceedings of the conference on Visualization'00. IEEE Computer Society Press, 2000.
  4. ^ Doleisch, Helmut, Martin Gasser, and Helwig Hauser. "Interactive feature specification for focus+context visualization of complex simulation data." Proceedings of the symposium on Data visualisation 2003. Eurographics Association, 2003.
  5. ^ a b Doleisch, Helmut. Visual analysis of complex simulation data using multiple heterogenous views. 2004.
  6. ^ a b c Kehrer, Johannes. Interactive visual analysis of multi-faceted scientific data. PhD dissertation, Dept. of Informatics, Univ. of Bergen, Norway, 2011.
  7. ^ a b c d e Konyha, Zoltán, et al. "Interactive visual analysis of families of curves using data aggregation and derivation." Proceedings of the 12th International Conference on Knowledge Management and Knowledge Technologies. ACM, 2012.
  8. ^ a b Matkovic, Krešimir, et al. "ComVis: A coordinated multiple views system for prototyping new visualization technology." Information Visualisation, 2008. IV'08. 12th International Conference. IEEE, 2008.
  9. ^ Roberts, Jonathan C. "State of the art: Coordinated & multiple views in exploratory visualization." Coordinated and Multiple Views in Exploratory Visualization, 2007. CMV'07. Fifth International Conference on. IEEE, 2007.
  10. ^ Martin, Allen R., and Matthew O. Ward. "High dimensional brushing for interactive exploration of multivariate data." Proceedings of the 6th Conference on Visualization'95. IEEE Computer Society, 1995.
  11. ^ Keim, Daniel A. "Information visualization and visual data mining." Visualization and Computer Graphics, IEEE Transactions on 8.1 (2002): 1-8.
  12. ^ Doleisch, Helmut, and Helwig Hauser. "Smooth brushing for focus+context visualization of simulation data in 3D." Journal of WSCG 10.1 (2002): 147-154.
  13. ^ Lamping, John, Ramana Rao, and Peter Pirolli. "A focus+context technique based on hyperbolic geometry for visualizing large hierarchies." Proceedings of the SIGCHI conference on Human factors in computing systems. ACM Press/Addison-Wesley Publishing Co., 1995.
  14. ^ a b Konyha, Zoltan, et al. "Interactive visual analysis of families of function graphs." Visualization and Computer Graphics, IEEE Transactions on 12.6 (2006): 1373-1385.
  15. ^ Hauser, Helwig. "Generalizing focus+context visualization." Scientific visualization: The visual extraction of knowledge from data. Springer Berlin Heidelberg, 2006. 305-327.
  16. ^ Doleisch, Helmut. "SimVis: Interactive visual analysis of large and time-dependent 3D simulation data." Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come. IEEE Press, 2007.