Exploratory causal analysis
Exploratory causal analysis (ECA), also known as data causality or causal discovery,[1] is the use of statistical algorithms to infer associations in observed data sets that are potentially causal under strict assumptions. ECA is a type of causal inference distinct from causal modeling and treatment effects in randomized controlled trials[2]. It is exploratory research that usually precedes more formal causal research, in the same way exploratory data analysis often precedes statistical hypothesis testing in data analysis[3][4].
Motivation
Many authors have pointed out that everyday data analysis is primarily concerned with causal questions[1][2][5][6][7]. For example, did the fertilizer cause the crops to grow?[8] Or, can a given sickness be prevented?[9] Or, why is my friend depressed?[10] Formal statistical approaches, such as potential outcomes and regression analysis, have been developed to handle such queries using designed experiments. Unfortunately, most collected data are observational and require care when used for causal inference (because, for example, of issues such as confounding)[11]. The same causal inference techniques used with experimental data require additional assumptions to produce reasonable inferences with observational data[12]. The difficulty of causal inference under such circumstances is often summed up as "correlation does not imply causation".
Overview
The basic idea of ECA is that there exist data analysis procedures, performed on specific subsets of variables within a larger set, whose outputs might be indicative of causality between those variables. For example, if every relevant covariate in the data is assumed to be observed, then propensity score matching can be used to estimate the causal effect of one observed variable on another[2]. Granger causality can likewise be used to infer causality between two observed variables under different, but similarly strict, assumptions[13].
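For illustration, the following is a minimal sketch of propensity score matching on synthetic observational data; the data-generating process, the true effect size (2.0), and the simple nearest-neighbour matching scheme are all invented for the example and are not drawn from the sources cited above.

```python
# A minimal, illustrative sketch of propensity score matching on synthetic
# observational data. The data-generating process is hypothetical and chosen
# only so that the true treatment effect (2.0) is known in advance.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                         # observed covariate (confounder)
p_treat = 1 / (1 + np.exp(-x))                 # treatment is more likely for large x
t = rng.random(n) < p_treat                    # treatment assignment
y = 2.0 * t + 3.0 * x + rng.normal(size=n)     # outcome: true effect of t is 2.0

# Naive difference in means is biased because x confounds both t and y.
print("naive estimate:", y[t].mean() - y[~t].mean())

# 1. Estimate propensity scores P(t = 1 | x) with a logistic regression.
ps = LogisticRegression().fit(x.reshape(-1, 1), t).predict_proba(x.reshape(-1, 1))[:, 1]

# 2. Match each treated unit to the control unit with the closest propensity score.
treated, controls = np.where(t)[0], np.where(~t)[0]
matches = controls[np.abs(ps[controls][None, :] - ps[treated][:, None]).argmin(axis=1)]

# 3. The average outcome difference over matched pairs estimates the effect on the treated.
print("matched estimate:", (y[treated] - y[matches]).mean())
```

In practice, matching analyses use dedicated packages and are accompanied by balance diagnostics; the sketch only conveys the idea that conditioning on the propensity score can remove confounding when all relevant covariates are observed.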
The two broad approaches to developing such procedures are using operational definitions of causality[3], or relying on verification by "truth" (i.e., explicitly ignoring the problem of defining causality and showing that a given algorithm identifies causal relationships in scenarios where causal relationships are known to exist, e.g., using synthetic data[1]).
Operational Definitions of Causality
One of the first operational definitions of causality was provided by Clive Granger in 1969[14]. Granger began with the definition of probabilistic causality proposed by Norbert Wiener and interpreted it as a comparison of variances; in Granger's own words[15]
An earlier concept that I was concerned with was that of causality. As a postdoctoral student in Princeton in 1959–1960, working with Professors John Tukey and Oskar Morgenstern, I was involved with studying something called the “cross-spectrum,” which I will not attempt to explain. Essentially one has a pair of inter-related time series and one would like to know if there are a pair of simple relations, first from the variable X explaining Y and then from the variable Y explaining X. I was having difficulty seeing how to approach this question when I met Dennis Gabor who later won the Nobel Prize in Physics in 1971. He told me to read a paper by the eminent mathematician Norbert Wiener which contained a definition that I might want to consider. It was essentially this definition, somewhat refined and rounded out, that I discussed, together with proposed tests in the mid 1960’s. The statement about causality has just two components:
1. The cause occurs before the effect; and
2. The cause contains information about the effect that is unique, and is in no other variable.
A consequence of these statements is that the causal variable can help forecast the effect variable after other data has first been used. Unfortunately, many users concentrated on this forecasting implication rather than on the original definition.
Thus, Granger defines C as the "cause" of E if the probability of E occurring given knowledge of everything in the universe differs from the probability of E occurring given knowledge of everything except C; in standard notation, $P(E \mid \Omega) \neq P(E \mid \Omega \setminus C)$, where $\Omega$ is all the knowledge available in the universe[3]. Granger created an operational definition of causality (i.e., "definitions of causality and feedback which permit tests for their existence"[14]) from this "philosophical definition" in Definition 1 of his 1969 paper[14] as follows: the time series $X_t$ is causing the time series $Y_t$ if

$$\sigma^2(Y \mid \bar{U}) < \sigma^2(Y \mid \bar{U} - \bar{X}),$$

where $\sigma^2(Y \mid \bar{U})$ is the variance of the error in predicting $Y$ from $\bar{U}$, $U$ is all the information in the universe, and the over-bar denotes past values, so that $\bar{U} - \bar{X}$ is all the past information in the universe except the past of the series $X_t$. (Granger causality has continued to evolve since this definition was introduced.)
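The variance comparison can be illustrated numerically. The sketch below (using only NumPy) reduces the "universe" to two synthetic series and fixes the lag at one, which is a strong simplification of the definition rather than a full Granger causality test.

```python
# A minimal numerical sketch of Granger's variance comparison, with the
# "universe" reduced to two synthetic series and the lag fixed at 1.
# This only illustrates the definition; it is not a full Granger causality test.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * x[t - 1] + 0.2 * y[t - 1] + 0.1 * rng.normal()  # X drives Y

def residual_variance(target, predictors):
    """Variance of the residuals from an ordinary least squares fit."""
    coef, *_ = np.linalg.lstsq(predictors, target, rcond=None)
    return np.var(target - predictors @ coef)

Y = y[1:]
Y_past = y[:-1, None]
X_past = x[:-1, None]

var_without_x = residual_variance(Y, Y_past)                    # sigma^2(Y | Y_past)
var_with_x = residual_variance(Y, np.hstack([Y_past, X_past]))  # sigma^2(Y | Y_past, X_past)

# In this reduced universe, X "Granger-causes" Y if its past lowers the error variance.
print(var_with_x < var_without_x)  # expected: True
```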
Some authors prefer using ECA techniques developed using operational definitions of causality because they believe it may help in the search for causal mechanisms[3][16].
Verification by "Truth"
Peter Spirtes, Clark Glymour, and Richard Scheines are among the most influential authors in the field of causal discovery and have promoted the idea of explicitly not providing a definition of causality; in their own words[1]
Why is so much of statistical application and so little of statistical theory concerned with causal inference? One reason sometimes given for avoiding any attempt at a mathematical analysis joining causality and probability is that the idea of causality involves a lot of metaphysical murk which a mathematical science does well to avoid. Some people try to give accounts of causation entirely in terms of probability relations, while others try to characterize causation in terms of counterfactual conditions. Who is to say which is right, or if the counterfactual characterizations are even partly right, what exactly is meant? (We, certainly, have no definition of causation to promote.)
and (from the same source)
Views about the nature of causation divide very roughly into those that analyze causal influence as some sort of probabilistic relation, those that analyze causal influence as some sort of counterfactual relation (sometimes a counterfactual relation having to do with manipulations or interventions), and those that prefer not to talk of causation at all. We advocate no definition of causation, but...we try to make our usage systematic, and to make explicit our assumptions connecting causal structure with probability, counterfactuals and manipulations. With suitable metaphysical gyrations the assumptions could be endorsed from any of these points of views, perhaps including even the last.
Spirtes and Glymour introduced the PC algorithm for causal discovery in 1990 and demonstrated its validity through comparisons with an existing discovery algorithm (the SGS algorithm) and through formal comparison with known graph structures (e.g., counting how many times the algorithm suggested an edge with an incorrect orientation)[17]. Many recent discovery algorithms take a similar approach to verification[18].
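A minimal sketch of this style of scoring is shown below; the two adjacency matrices are hypothetical placeholders standing in for a known ground-truth graph and a discovery algorithm's output, and the tallies simply count missing, extra, and mis-oriented edges.

```python
# Sketch of "verification by truth": score a discovered directed graph against a
# known ground-truth graph by counting missing, extra, and mis-oriented edges.
# Both adjacency matrices below are hypothetical stand-ins for real output.
import numpy as np

# true[i, j] = 1 means the true graph has the directed edge i -> j.
true = np.array([[0, 1, 1, 0],
                 [0, 0, 0, 1],
                 [0, 0, 0, 1],
                 [0, 0, 0, 0]])

# Hypothetical output of a discovery algorithm run on data from the true graph.
found = np.array([[0, 1, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 0, 0],
                  [0, 0, 1, 0]])

true_skel, found_skel = true + true.T, found + found.T
missing = int(np.sum((true_skel == 1) & (found_skel == 0)) // 2)  # adjacency missed
extra = int(np.sum((true_skel == 0) & (found_skel == 1)) // 2)    # adjacency invented
# Edge present in both skeletons but pointing the wrong way in the discovered graph.
misoriented = int(np.sum((true == 1) & (found.T == 1)))

print(missing, extra, misoriented)  # 1 missing, 0 extra, 1 mis-oriented
```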
Techniques
Many authors present surveys of causal discovery techniques[1][3][18][19][20][21]. This section lists a few of the more well-known techniques along with references and links to help the interested user. Software packages for implementing many of these techniques are presented below.
Bivariate (or "pairwise")
- Granger causality (there is also a Scholarpedia entry)
- transfer entropy (a minimal illustrative sketch follows this list)
- convergent cross mapping
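As an illustration of one of these pairwise measures, the sketch below computes a simple plug-in estimate of transfer entropy between two synthetic binary series with a history length of one; the coupled process is invented for the example, and practical analyses typically use dedicated estimators such as those provided by JIDT (see Software packages below).

```python
# Sketch of a plug-in estimate of lag-1 transfer entropy X -> Y for binary series:
# TE = sum p(y_t+1, y_t, x_t) * log2( p(y_t+1 | y_t, x_t) / p(y_t+1 | y_t) ).
# The coupled binary processes below are hypothetical and chosen so that X drives Y.
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)
n = 50000
x = rng.integers(0, 2, size=n)
y = np.empty(n, dtype=int)
y[0] = 0
for t in range(1, n):
    # Y mostly copies X's previous value, with 10% noise.
    y[t] = x[t - 1] if rng.random() < 0.9 else rng.integers(0, 2)

def transfer_entropy(source, target):
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))  # (y_t+1, y_t, x_t)
    pairs_ty = Counter(zip(target[1:], target[:-1]))              # (y_t+1, y_t)
    pairs_yx = Counter(zip(target[:-1], source[:-1]))             # (y_t, x_t)
    singles = Counter(target[:-1])                                # y_t
    total = len(target) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / total
        p_cond_full = c / pairs_yx[(y0, x0)]                      # p(y_t+1 | y_t, x_t)
        p_cond_hist = pairs_ty[(y1, y0)] / singles[y0]            # p(y_t+1 | y_t)
        te += p_joint * np.log2(p_cond_full / p_cond_hist)
    return te

print("TE X->Y:", transfer_entropy(x, y))  # clearly positive
print("TE Y->X:", transfer_entropy(y, x))  # near zero
```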
Multivariate
Many of these techniques are discussed in the tutorials provided by the Center for Causal Discovery (CCD).
Use-case Examples
Social Science
Spirtes et al. have presented many examples of the PC algorithm applied to social science data sets[1].
Medicine
An illustration of using the PC algorithm with medical data can be found in the work of Cheek et al.[26]. Granger causality has been used to explore causality in fMRI data[27]. The CCD has also tested its tools using biomedical data.
Physics
In physics, ECA is often applied to try to better understand the physical causal mechanisms of the system, e.g., in geophysics using the PC-stable algorithm (a variant of the original PC algorithm)[28] and in dynamical systems using pairwise asymmetric inference (a variant of convergent cross mapping)[29].
"Exploratory" Emphasis
Most criticism of causal discovery arises when its exploratory nature is not emphasized[1]. Freedman and Humphreys provide a typical example of ECA criticism[23]:
Spirtes, Glymour, and Scheines have developed algorithms for causal discovery. We have been quite critical of their work. Korb and Wallace, as well as SGS, have tried to answer the criticisms. This paper will continue the discussion. Their responses may lead to progress in clarifying assumptions behind the methods, but there is little progress in demonstrating that the assumptions hold true for any real applications. The mathematical theory may be of some interest, but claims to have developed a rigorous engine for inferring causation from association are premature at best. The theorems have no implications for samples of any realistic size. Furthermore, examples used to illustrate the algorithms are diagnostic of failure rather than success. There remains a wide gap between association and causation.
Judea Pearl has emphasized that causal inference requires a causal model developed by "intelligence" through an iterative process of testing assumptions and fitting data[5]. In his own words[5],
Much of this data-centric history still haunts us today. We live in an era that presumes Big Data to be the solution to all our problems. Courses in "data science" are proliferating in our universities, and jobs for "data scientists" are lucrative in the companies that participate in the "data economy". But I hope with this book to convince you that data are profoundly dumb.
All causal discovery techniques rely on assumptions that may not hold for a given data set[1]. There are assumptions regarding the ability to infer causality from data[12][30][31] and assumptions about the data itself[32]. Thus, any causal relationships discovered during ECA are contingent on these assumptions holding true[23].
ECA is not meant to replace intelligence and experimentation in developing causal models[33]. (In many cases, the intelligence required for creating "true" causal models is a human being as proposed by Freedman and Humphreys[23], but Pearl has proposed that this task might also be handled by strong AI[5].) Spirtes, Glymour, and Scheines suggest running several different causal discovery algorithms on the same data set (e.g., using different tests for statistical independence) and using "common sense" to eliminate erroneous discovered causal relationships[1]. Other authors have proposed using the output of ECA to help create experiments more specifically designed to collect the data an experimenter actually cares about[3].
Software Packages
Comprehensive toolkits
- Tetrad
- Tetrad is an open-source, GUI-based Java program that provides the most comprehensive collection of causal discovery algorithms currently available. The algorithm library used by Tetrad is also available as a command-line tool, Python API, and R wrapper.
- Java Information Dynamics Toolkit (JIDT)
- JIDT is an open-source Java library for performing information-theoretic causal discovery (e.g., transfer entropy, conditional transfer entropy, etc.). Examples of using the library from MATLAB, GNU Octave, Python, R, Julia, and Clojure are provided in the documentation.
- pcalg
- pcalg is an R package that provides some of the same causal discovery algorithms as Tetrad.
Specific Techniques
- Granger causality
- convergent cross mapping
- LiNGAM
- MATLAB/GNU Octave package for LiNGAM

There is also a useful collection of tools and data maintained by the Causality Workbench team and the CCD team.
References
1. Spirtes, P.; Glymour, C.; Scheines, R. (2012). Causation, Prediction, and Search. Springer Science & Business Media. ISBN 978-1461227489.
2. Rosenbaum, Paul (2017). Observation and Experiment: An Introduction to Causal Inference. Harvard University Press. ISBN 9780674975576.
3. McCracken, James (2016). Exploratory Causal Analysis with Time Series Data (Synthesis Lectures on Data Mining and Knowledge Discovery). Morgan & Claypool Publishers. ISBN 978-1627059343.
4. Tukey, John W. (1977). Exploratory Data Analysis. Pearson. ISBN 978-0201076165.
5. Pearl, Judea (2018). The Book of Why: The New Science of Cause and Effect. Basic Books. ISBN 978-0465097616.
6. Kleinberg, Samantha (2015). Why: A Guide to Finding and Using Causes. O'Reilly Media, Inc. ISBN 978-1491952191.
7. Illari, P.; Russo, F. (2014). Causality: Philosophical Theory meets Scientific Practice. OUP Oxford. ISBN 978-0191639685.
8. Fisher, R. (1937). The Design of Experiments. Oliver and Boyd.
9. Hill, B. (1955). Principles of Medical Statistics. Lancet Limited.
10. Halpern, J. (2016). Actual Causality. MIT Press. ISBN 978-0262035026.
11. Pearl, J.; Glymour, M.; Jewell, N. P. (2016). Causal Inference in Statistics: A Primer. John Wiley & Sons. ISBN 978-1119186847.
12. Stone, R. (1993). "The Assumptions on Which Causal Inferences Rest". Journal of the Royal Statistical Society, Series B (Methodological). 55 (2): 455–466.
13. Granger, C. (1980). "Testing for causality: a personal viewpoint". Journal of Economic Dynamics and Control. 2: 329–352.
14. Granger, C. W. J. (1969). "Investigating Causal Relations by Econometric Models and Cross-spectral Methods". Econometrica. 37 (3): 424–438. doi:10.2307/1912791. JSTOR 1912791.
15. Granger, Clive. "Prize Lecture". NobelPrize.org. Nobel Media AB 2018.
16. Woodward, James (2004). Making Things Happen: A Theory of Causal Explanation (Oxford Studies in the Philosophy of Science). Oxford University Press. ISBN 978-1435619999.
17. Spirtes, P.; Glymour, C. (1991). "An algorithm for fast recovery of sparse causal graphs". Social Science Computer Review. 9 (1): 62–72.
18. Guo, R.; Cheng, L.; Li, J.; Hahn, P. R.; Liu, H. (2018). "A Survey of Learning Causality with Data: Problems and Methods". arXiv:1809.09337 [cs.AI].
19. Malinsky, D.; Danks, D. (2018). "Causal discovery algorithms: A practical guide". Philosophy Compass. 13 (1): e12470. doi:10.1111/phc3.12470.
20. Spirtes, P.; Zhang, K. (2016). "Causal discovery and inference: concepts and recent methodological advances". Applied Informatics. 3: 3. doi:10.1186/s40535-016-0018-x. PMC 4841209. PMID 27195202.
21. Yu, K.; Li, J.; Liu, L. (2016). "A review on algorithms for constraint-based causal discovery". arXiv:1611.03977 [cs.AI].
22. Bollt, E.; Sun, J. (2014). "Causation entropy identifies indirect influences, dominance of neighbors and anticipatory couplings". Physica D: Nonlinear Phenomena. 267: 49–57. arXiv:1504.03769. Bibcode:2014PhyD..267...49S. doi:10.1016/j.physd.2013.07.001.
23. Freedman, D.; Humphreys, P. (1999). "Are there algorithms that discover causal structure?". Synthese. 121 (1–2): 29–54. doi:10.1023/A:100527761 (inactive 2018-12-19).
24. Raghu, V. K.; Ramsey, J. D.; Morris, A.; et al. (2018). "Comparison of strategies for scalable causal discovery of latent variable models from mixed data". International Journal of Data Science and Analytics. 6: 33–45. doi:10.1007/s41060-018-0104-3. PMC 6096780. PMID 30148202.
25. Shimizu, S. (2014). "LiNGAM: non-Gaussian methods for estimating causal structures". Behaviormetrika. 41 (1): 65–98. doi:10.2333/bhmk.41.65.
26. Cheek, C.; Zheng, H.; Hallstrom, B. R.; Hughes, R. E. (2018). "Application of a Causal Discovery Algorithm to the Analysis of Arthroplasty Registry Data". Biomedical Engineering and Computational Biology. 9. doi:10.1177/1179597218756896. PMC 5826097. PMID 29511363.
27. Wen, X.; Rangarajan, G.; Ding, M. (2013). "Is Granger Causality a Viable Technique for Analyzing fMRI Data?". PLoS ONE. 8 (7): e67428. Bibcode:2013PLoSO...867428W. doi:10.1371/journal.pone.0067428. PMC 3701552. PMID 23861763.
28. Ebert-Uphoff, I.; Deng, Y. (2017). "Causal discovery in the geosciences—Using synthetic data to learn how to interpret results". Computers & Geosciences. 99: 50–60. Bibcode:2017CG.....99...50E. doi:10.1016/j.cageo.2016.10.008.
29. McCracken, J.; Weigel, R. (2014). "Convergent cross-mapping and pairwise asymmetric inference". Physical Review E. 90 (6): 062903. arXiv:1407.5696. Bibcode:2014PhRvE..90f2903M. doi:10.1103/PhysRevE.90.062903. PMID 25615160.
30. Scheines, R. (1997). "An introduction to causal inference". Causality in Crisis: 185–199.
31. Holland, P. W. (1986). "Statistics and causal inference". Journal of the American Statistical Association. 81 (396): 945–960. doi:10.1080/01621459.1986.10478354.
32. Imbens, G. W.; Rubin, D. B. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press. ISBN 978-0521885881.
33. Morgan, S. L.; Winship, C. (2015). Counterfactuals and Causal Inference. Cambridge University Press. ISBN 978-1107065079.