Circular analysis
Circular analysis is the selection of the parameters of an analysis using the same data that are then analysed. It is often called double dipping, because the same data are used twice: once to choose the analysis and once to test it. Circular analysis inflates the apparent statistical strength of results and, in the most extreme case, can produce a strongly significant result from pure noise.
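The inflation can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings: the function names, the number of features, the group sizes and the selection rule are all hypothetical and are not taken from any cited study. Two groups are drawn from the same noise distribution, the features with the largest group difference are selected, and the average difference of those features is then measured twice: on the data used for selection (circular) and on freshly drawn data (independent).

```python
import random


def group_difference(a, b, feature):
    """Mean of group a minus mean of group b for a single feature."""
    mean_a = sum(row[feature] for row in a) / len(a)
    mean_b = sum(row[feature] for row in b) / len(b)
    return mean_a - mean_b


def simulate(n_features=1000, n_per_group=20, n_selected=10, seed=0):
    """Return (circular, independent) effect estimates for pure-noise data.

    All sizes are illustrative assumptions, not values from the literature.
    """
    rng = random.Random(seed)

    def noise_group():
        # Every value is standard Gaussian noise: there is no true effect.
        return [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
                for _ in range(n_per_group)]

    a, b = noise_group(), noise_group()          # data used for selection
    a_new, b_new = noise_group(), noise_group()  # independent data

    # Selection step: keep the features with the largest difference a - b.
    diffs = [group_difference(a, b, f) for f in range(n_features)]
    selected = sorted(range(n_features), key=lambda f: diffs[f],
                      reverse=True)[:n_selected]

    # Circular estimate: measured on the same data that drove the selection.
    circular = sum(diffs[f] for f in selected) / n_selected
    # Independent estimate: same features, measured on data not used so far.
    independent = sum(group_difference(a_new, b_new, f)
                      for f in selected) / n_selected
    return circular, independent


if __name__ == "__main__":
    circular, independent = simulate()
    print(f"circular estimate:    {circular:.2f}")
    print(f"independent estimate: {independent:.2f}")
```

Because the features were chosen for having a large difference, the circular estimate is well above zero even though no effect exists, whereas the independent estimate stays close to zero.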
Examples
At its simplest, circular analysis includes deciding to remove outliers after noticing that doing so improves the result of an experiment. The effect can be more subtle. fMRI data, for example, often need a considerable amount of pre-processing, and pre-processing steps might be applied incrementally until the analysis 'works'. Similarly, the classifiers used in multivoxel pattern analysis of fMRI data have parameters, which could be tuned to maximise classification accuracy on the very data used to report that accuracy.
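The outlier example can be sketched as a simulation. Everything in the code below is an assumption made for illustration: the candidate exclusion thresholds, the sample size and the cut-off of 2.0 (used as a rough stand-in for a 5% significance threshold) are hypothetical. Each simulated experiment compares two groups of pure noise, either with the analysis fixed in advance (no exclusion) or with the exclusion rule chosen afterwards to give the largest test statistic.

```python
import math
import random


def t_statistic(a, b):
    """Welch t statistic for two samples."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


def trim(sample, k):
    """Drop values more than k sample standard deviations from the mean."""
    mean = sum(sample) / len(sample)
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (len(sample) - 1))
    return [x for x in sample if abs(x - mean) <= k * sd]


def false_positive_rates(n_experiments=2000, n=20, cutoff=2.0,
                         thresholds=(3.0, 2.5, 2.0, 1.5), seed=1):
    """Return (planned, flexible) rates of |t| > cutoff on pure noise."""
    rng = random.Random(seed)
    planned_hits = 0
    flexible_hits = 0
    for _ in range(n_experiments):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Planned analysis: no exclusion, decided before seeing the data.
        planned = abs(t_statistic(a, b))
        # Flexible analysis: try every exclusion rule and keep the best one.
        variants = [planned] + [abs(t_statistic(trim(a, k), trim(b, k)))
                                for k in thresholds]
        planned_hits += planned > cutoff
        flexible_hits += max(variants) > cutoff
    return planned_hits / n_experiments, flexible_hits / n_experiments


if __name__ == "__main__":
    planned, flexible = false_positive_rates()
    print(f"planned analysis:  {planned:.3f}")
    print(f"flexible analysis: {flexible:.3f}")
```

The flexible rate can never be lower than the planned rate, because the planned analysis is one of the variants considered; any excess is produced solely by choosing the exclusion rule after looking at the result.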
Solutions
Designing the analysis carefully before collecting the data ensures that the choice of analysis is not affected by the data collected. Alternatively, one might refine the analysis on one or two participants and then apply it unchanged to the data from the remaining participants. For the selection of classification parameters, a common method is to divide the data into two sets, find the optimal parameter value using one set, and then test with that value on the second set. This is a standard technique, used for example by the Princeton MVPA Toolbox.
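The two-set procedure can be sketched as follows. This is a minimal illustration and not the interface of the Princeton toolbox or of any other library: the threshold classifier, the parameter grid, the data sizes and all function names are assumptions. The labels are generated independently of the features, so an honest accuracy estimate should be close to chance (0.5).

```python
import random


def accuracy(rows, labels, feature, threshold):
    """Accuracy of the rule: predict label 1 if row[feature] > threshold."""
    hits = sum((row[feature] > threshold) == bool(label)
               for row, label in zip(rows, labels))
    return hits / len(rows)


def tune(rows, labels, n_features, thresholds):
    """Grid search for the (feature, threshold) pair with best accuracy."""
    candidates = [(f, t) for f in range(n_features) for t in thresholds]
    return max(candidates, key=lambda p: accuracy(rows, labels, p[0], p[1]))


def split_evaluation(n_tune=40, n_test=200, n_features=200, seed=2):
    """Tune on one set, then score on both the tuning set and a held-out set."""
    rng = random.Random(seed)
    n = n_tune + n_test
    rows = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for _ in range(n)]
    labels = [rng.randint(0, 1) for _ in range(n)]  # unrelated to the features
    thresholds = (-0.5, 0.0, 0.5)

    # Divide the data into a tuning set and a test set.
    tune_rows, tune_labels = rows[:n_tune], labels[:n_tune]
    test_rows, test_labels = rows[n_tune:], labels[n_tune:]

    # Parameters are chosen using the tuning set only.
    feature, threshold = tune(tune_rows, tune_labels, n_features, thresholds)

    # Circular score: the same data chose the parameters and scores them.
    circular = accuracy(tune_rows, tune_labels, feature, threshold)
    # Held-out score: the test set played no part in choosing the parameters.
    held_out = accuracy(test_rows, test_labels, feature, threshold)
    return circular, held_out


if __name__ == "__main__":
    circular, held_out = split_evaluation()
    print(f"accuracy on tuning set:   {circular:.2f}")
    print(f"accuracy on held-out set: {held_out:.2f}")
```

The accuracy on the tuning set is well above chance despite the absence of any real signal, while the accuracy on the held-out set remains near 0.5, which is why only the second figure should be reported.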
References
Kriegeskorte, Nikolaus, et al. "Circular analysis in systems neuroscience: the dangers of double dipping." Nature Neuroscience 12.5 (2009): 535-540.