EDITING FROM ARTICLE "FIELD EXPERIMENTS"
Field experiments, like lab experiments, randomly assign subjects (or other sampling units) to treatment or control groups in order to test claims of causal relationships. Random assignment helps establish the comparability of the treatment and control groups, so that any differences between them that emerge after the treatment has been administered plausibly reflect the influence of the treatment rather than pre-existing differences between the groups. The distinguishing characteristics of field experiments are that they are conducted in real-world settings and often unobtrusively. This contrasts with laboratory experiments, which enforce scientific control by testing a hypothesis in the artificial and highly controlled setting of a laboratory. Field experiments also differ from naturally occurring experiments and quasi-experiments[1]. In naturally occurring experiments, an external force (e.g., a government or nonprofit) controls the randomization of treatment assignment and its implementation, whereas in field experiments the researchers retain control over both. Quasi-experiments occur when treatments are administered as-if randomly (e.g., U.S. Congressional districts where candidates win by slim margins[2], weather patterns, natural disasters, etc.).
Field experiments encompass a broad array of experimental designs, each with varying degrees of generality. Some criteria of generality (e.g., authenticity of treatments, participants, contexts, and outcome measures) refer to the contextual similarities between the subjects in the experimental sample and the rest of the population. Field experiments are increasingly used in the social sciences to study the effects of policy-related interventions in domains such as health, education, crime, social welfare, and politics.
Characteristics of Field Experiments
Under random assignment, outcomes of field experiments are reflective of the real world because subjects are assigned to groups based on non-deterministic probabilities[3]. Two other core assumptions underlie the ability of the researcher to collect unbiased potential outcomes: excludability and non-interference[4][5]. The excludability assumption provides that the only relevant causal agent is the receipt of the treatment itself; asymmetries in the assignment, administration, or measurement of the treatment and control groups violate this assumption. The non-interference assumption, also known as the Stable Unit Treatment Value Assumption (SUTVA), holds that the value of a subject's outcome depends only on whether that subject is assigned the treatment, not on whether other subjects are assigned the treatment. When these three core assumptions are met, researchers are more likely to obtain unbiased estimates from field experiments.
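A minimal simulation can make these assumptions concrete. The sketch below (illustrative Python with hypothetical potential outcomes, not drawn from the cited sources) randomly assigns treatment and shows that the simple difference-in-means estimator recovers the average treatment effect when excludability and non-interference hold:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential outcomes: Y(0) if untreated, Y(1) if treated.
y0 = rng.normal(50, 10, n)
y1 = y0 + 5  # true average treatment effect of 5

# Random assignment: each subject has a non-deterministic probability
# of treatment (here 0.5), independent of the potential outcomes.
z = rng.binomial(1, 0.5, n)

# Excludability: the observed outcome depends only on treatment receipt.
# Non-interference (SUTVA): no subject's outcome depends on others' assignments.
y_obs = np.where(z == 1, y1, y0)

# Difference-in-means estimator of the average treatment effect.
ate_hat = y_obs[z == 1].mean() - y_obs[z == 0].mean()
print(f"estimated ATE: {ate_hat:.2f} (truth: 5)")
```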
After designing the field experiment and gathering the data, researchers can use statistical inference tests to determine the size and strength of the intervention's effect on the subjects. Field experiments allow researchers to collect large and diverse types of data. For example, a researcher could design an experiment that uses pre- and post-trial information in an appropriate statistical inference method to test whether an intervention affects subject-level changes in outcomes.
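As one common instance of such a test, subject-level pre-to-post changes can be compared across groups with a two-sample t-test. The sketch below is illustrative only, uses simulated data, and assumes the scipy library is available:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 500

# Simulated pre- and post-trial measurements for treatment and control.
pre_t = rng.normal(100, 15, n)
post_t = pre_t + rng.normal(4, 5, n)   # treated: average gain of 4
pre_c = rng.normal(100, 15, n)
post_c = pre_c + rng.normal(0, 5, n)   # control: no systematic change

# Compare subject-level changes across groups (a simple change-score analysis).
t_stat, p_value = stats.ttest_ind(post_t - pre_t, post_c - pre_c)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```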
Practical Uses
Field experiments offer researchers a way to test theories and answer questions with higher external validity because they simulate real-world occurrences[6]. Some researchers argue that field experiments are a better guard against potential bias and biased estimators. Field experiments can also act as benchmarks for comparing observational data to experimental results. Using field experiments as benchmarks can help determine levels of bias in observational studies, and, since researchers often develop a hypothesis from an a priori judgment, benchmarks can help to add credibility to a study[7]. While some argue that covariate adjustment or matching designs might work just as well in eliminating bias, field experiments can increase certainty[8] by reducing omitted variable bias, because random assignment balances observed and unobserved factors across groups[9].
Researchers can utilize machine learning methods to simulate, reweight, and generalize experimental data[10]. This increases the speed and efficiency of gathering experimental results and reduces the costs of implementing the experiment. Another cutting-edge technique in field experiments is the multi-armed bandit design[11], including similar adaptive designs for experiments with variable outcomes and variable treatments over time[12].
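To illustrate the adaptive logic behind bandit designs, the following sketch implements Thompson sampling for binary outcomes (the Beta-Bernoulli setup discussed in the Bayesian bandit literature cited above); arms that perform well are assigned more often as evidence accumulates. This is a generic textbook implementation with made-up success rates, not code from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(2)
true_rates = [0.10, 0.15, 0.25]          # unknown success rate of each arm
successes = np.ones(len(true_rates))     # Beta(1, 1) uniform priors
failures = np.ones(len(true_rates))

for _ in range(5_000):
    # Sample a plausible rate for each arm from its posterior,
    # then assign the next subject to the best-looking arm.
    draws = rng.beta(successes, failures)
    arm = int(np.argmax(draws))
    outcome = rng.random() < true_rates[arm]
    successes[arm] += outcome
    failures[arm] += 1 - outcome

print("posterior mean per arm:", successes / (successes + failures))
print("assignments per arm:", (successes + failures - 2).astype(int))
```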
Limitations
There are limitations of and arguments against using field experiments in place of other research designs (e.g., lab experiments, survey experiments, observational studies, etc.). Given that field experiments necessarily take place in a specific geographic and political setting, there is a concern about extrapolating the outcomes to formulate a general theory regarding the population of interest. However, researchers have begun to find strategies to effectively generalize causal effects outside of the sample by comparing the environments of the treated population and the external population, drawing on information from larger samples, and accounting and modeling for treatment effect heterogeneity within the sample[13]. Others have used covariate blocking techniques to generalize from field experiment populations to external populations[14].
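One simple version of such generalization is post-stratification: reweight stratum-level effect estimates from the experimental sample by each stratum's share in the target population. The sketch below uses entirely hypothetical shares and estimates to show the mechanics:

```python
import numpy as np

# Stratum-specific effect estimates from the experimental sample,
# e.g. for younger vs. older subjects (hypothetical numbers).
stratum_effects = np.array([2.0, 6.0])

# Stratum shares in the experimental sample vs. the external population.
sample_shares = np.array([0.7, 0.3])
target_shares = np.array([0.4, 0.6])

sample_ate = stratum_effects @ sample_shares   # effect in the sample
target_ate = stratum_effects @ target_shares   # reweighted to the population
print(f"sample ATE: {sample_ate:.1f}, reweighted target ATE: {target_ate:.1f}")
```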
Noncompliance issues affecting field experiments (both one-sided and two-sided noncompliance[15][16]) can occur when subjects who are assigned to a certain group never receive their assigned intervention. Other data-collection problems include attrition (where subjects who are treated fail to provide outcome data), which under certain conditions will bias the collected data. These problems can lead to imprecise data analysis; however, researchers who use field experiments can use statistical methods to calculate useful information even when these difficulties occur[17].
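Under one-sided noncompliance, a standard remedy is to use the random assignment as an instrument: divide the intention-to-treat effect by the difference in treatment take-up between groups (the Wald estimator of the complier average causal effect, in the spirit of the instrumental-variable work cited above). A simplified simulated sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

z = rng.binomial(1, 0.5, n)              # random assignment
complier = rng.random(n) < 0.6           # 60% take treatment if assigned
d = z * complier                         # one-sided noncompliance: no takers in control
y = 10 + 5 * d + rng.normal(0, 2, n)     # receiving treatment raises outcomes by 5

itt = y[z == 1].mean() - y[z == 0].mean()        # intention-to-treat effect
take_up = d[z == 1].mean() - d[z == 0].mean()    # compliance rate
print(f"ITT: {itt:.2f}, Wald/CACE estimate: {itt / take_up:.2f} (truth: 5)")
```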
Using field experiments can also lead to concerns over interference[18] between subjects. When a treated subject or group affects the outcomes of a nontreated group (through conditions like displacement, communication, contagion, etc.), the nontreated group's observed outcome may no longer reflect the true untreated outcome. A subset of interference is the spillover effect, which occurs when the treatment of treated groups has an effect on neighboring untreated groups.
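A small simulation (illustrative numbers only) shows how spillover biases the naive comparison: if the treatment leaks to untreated subjects, the control group no longer represents the true untreated state and the estimated effect shrinks:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
z = rng.binomial(1, 0.5, n)

# Treatment raises treated outcomes by 5, but spillover (e.g. via
# communication or contagion) also raises untreated outcomes by 2.
y = 50 + 5 * z + 2 * (1 - z) + rng.normal(0, 3, n)

naive = y[z == 1].mean() - y[z == 0].mean()
print(f"naive estimate: {naive:.2f} (true direct effect: 5)")
```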
Critics also point to the concern of heterogeneous treatment effects, which occur when subjects respond to the same treatment in categorically different ways. This across-subject variability can present problems for assessing causal mechanisms, leading some researchers to use covariate adjustment methods. Critics argue that such adjustment adds variability, error, and potential bias to results; however, others show that increasing the sample size will dull the negative effects of using these and other regression methods[19]. Another popular method to address heterogeneous treatment effects is a block design, which stratifies subjects based on observable traits and randomizes treatment and control within those strata; this stratification can be done effectively either before or after implementing an intervention[20].
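Block randomization can be implemented in a few lines: group subjects by an observable trait and randomize within each group so treatment and control are balanced inside every stratum. A minimal sketch with a hypothetical blocking variable:

```python
import numpy as np

rng = np.random.default_rng(4)
strata = np.array(["urban"] * 6 + ["rural"] * 6)   # hypothetical blocking trait
assignment = np.empty(len(strata), dtype=int)

for s in np.unique(strata):
    # Shuffle the subjects within this stratum, then treat exactly half,
    # so treatment and control are balanced inside every block.
    idx = np.flatnonzero(strata == s)
    rng.shuffle(idx)
    assignment[idx[: len(idx) // 2]] = 1
    assignment[idx[len(idx) // 2:]] = 0

for s in np.unique(strata):
    print(s, "treated:", assignment[strata == s].sum(), "of", (strata == s).sum())
```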
Field experiments can be expensive and time-consuming to conduct, and difficult to replicate. Subjects or populations might undermine the implementation process if there is a perception of unfairness in treatment selection (e.g., in 'negative income tax' experiments, communities may lobby for their community to receive the cash transfer, so that assignment is no longer purely random). Similarly, those who implement the randomization can contaminate the randomization scheme. There is also a limit on a researcher's ability to obtain the informed consent of all participants. The resulting data are therefore more varied (larger standard deviations, less precision and accuracy), which leads to the use of larger sample sizes in field testing. However, others argue that, even though replicability is difficult, if the results of an experiment are important, then there is a greater chance that the experiment will be replicated. Additionally, ethical considerations can factor into experimental designs: field experiments can adopt a "stepped-wedge" design that eventually gives the entire sample access to the intervention on staggered timing schedules[21], and researchers can design a blinded field experiment to remove possibilities of manipulation.
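The assignment schedule of a stepped-wedge design is easy to visualize: clusters cross over from control to treatment at staggered times until all are treated. The sketch below follows the general logic of the cited design and is not the authors' code:

```python
import numpy as np

clusters, periods = 4, 5
# Cluster i crosses over to treatment at period i + 1 and stays treated,
# so every cluster eventually receives the intervention.
schedule = np.array([[int(t > i) for t in range(periods)]
                     for i in range(clusters)])
print(schedule)  # rows: clusters; columns: periods (1 = treated)
```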
Examples
The history of experiments in the lab and the field has left longstanding impacts in the physical, natural, and life sciences. The modern use of field experiments has roots in the 1700s, when James Lind utilized a controlled field experiment to identify a treatment for scurvy[22].
Other examples of disciplines that use field experiments include:
- Economists have used field experiments to analyze discrimination, health care programs, charitable fundraising, education, information aggregation in markets, and microfinance programs.
- Engineers often conduct field tests of prototype products to validate earlier laboratory tests and to obtain broader feedback.
- Social science researchers use field experiments to see how humans interact and respond to changes in social, political, and governmental patterns.
- Geology has a long history of field experiments, dating to the time of Avicenna.[citation needed]
- Field experiments in anthropology date back to Biruni's study of India[23].
- Social psychology has pioneering figures who utilized field experiments, including Kurt Lewin and Stanley Milgram.
- Agricultural science researcher R.A. Fisher analyzed data from randomized experiments on actual crop "fields"[24].
- Political Science researcher Harold Gosnell conducted an early field experiment in 1926 on voter participation[25].
References
- ^ Meyer, Bruce D. 1995. "Natural and Quasi-Experiments in Economics." Journal of Business & Economic Statistics 13(2): 151–161. https://www.jstor.org/stable/1392369
- ^ Lee, David S., Enrico Moretti, and Matthew J. Butler. 2004. "Do Voters Affect or Elect Policies? Evidence from the U.S. House." Quarterly Journal of Economics 119(3): 807–859. https://www.jstor.org/stable/25098703
- ^ Rubin, Donald B. 2005. "Causal Inference Using Potential Outcomes: Design, Modeling, Decisions." Journal of the American Statistical Association 100(469): 322–331. https://doi.org/10.1198/016214504000001880
- ^ Nyman, Pär. 2017. "Door-to-door Canvassing in the European Elections: Evidence from a Swedish Field Experiment." Electoral Studies. https://doi.org/10.1016/j.electstud.2016.12.002
- ^ Broockman, David, Joshua Kalla, and Jasjeet Sekhon. 2017. "The Design of Field Experiments With Survey Outcomes: A Framework for Selecting More Efficient, Robust, and Ethical Designs." Political Analysis 25(4): 435–464. https://doi.org/10.1017/pan.2017.27
- ^ Duflo, Esther. "Field Experiments in Development Economics." http://econ-www.mit.edu/files/800
- ^ Harrison, Glenn W., and John A. List. 2004. "Field Experiments." Journal of Economic Literature 42(4): 1009–1055. https://www.jstor.org/stable/3594915
- ^ LaLonde, Robert J. 1986. "Evaluating the Econometric Evaluations of Training Programs with Experimental Data." American Economic Review 76(4): 604–620. https://www.jstor.org/stable/1806062
- ^ Gordon, Brett R., et al. 2017. "A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook." https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3033144
- ^ Athey, Susan, and Guido Imbens. 2016. "Recursive Partitioning for Heterogeneous Causal Effects." Proceedings of the National Academy of Sciences 113(27): 7353–7360. https://doi.org/10.1073/pnas.1510489113
- ^ Scott, Steven L. 2010. "A Modern Bayesian Look at the Multi-armed Bandit." Applied Stochastic Models in Business and Industry 26: 639–658. http://www.economics.uci.edu/~ivan/asmb.874.pdf
- ^ Raj, Vishnu, and Sheetal Kalyani. 2017. "Taming Non-stationary Bandits: A Bayesian Approach." arXiv:1707.09727. https://arxiv.org/abs/1707.09727
- ^ Dehejia, Rajeev, Cristian Pop-Eleches, and Cyrus Samii. 2017. "From Local to Global: External Validity in a Fertility Natural Experiment." NBER Working Paper 21459. https://www.nber.org/papers/w21459
- ^ Egami, Naoki, and Erin Hartman. 2018. "Covariate Selection for Generalizing Experimental Results." Unpublished manuscript. https://scholar.princeton.edu/sites/default/files/negami/files/covselect.pdf
- ^ Blackwell, Matthew. 2017. "Instrumental Variable Methods for Conditional Effects and Causal Interaction in Voter Mobilization Experiments." Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2016.1246363
- ^ Aronow, Peter M., and Alison Carnegie. 2013. "Beyond LATE: Estimation of the Average Treatment Effect with an Instrumental Variable." Political Analysis 21: 492–506. https://www.cambridge.org/core/journals/political-analysis/article/beyond-late-estimation-of-the-average-treatment-effect-with-an-instrumental-variable/604E0803793175CF88329DB34DAA80B3
- ^ Aronow, Peter M., and Alison Carnegie. 2013. "Beyond LATE: Estimation of the Average Treatment Effect with an Instrumental Variable." Political Analysis 21: 492–506. https://www.cambridge.org/core/journals/political-analysis/article/beyond-late-estimation-of-the-average-treatment-effect-with-an-instrumental-variable/604E0803793175CF88329DB34DAA80B3
- ^ Aronow, Peter M., and Cyrus Samii. 2017. "Estimating Average Causal Effects Under General Interference, with Application to a Social Network Experiment." Annals of Applied Statistics 11(4): 1912–1947. https://projecteuclid.org/euclid.aoas/1514430272
- ^ Lin, Winston. 2013. "Agnostic Notes on Regression Adjustments to Experimental Data: Reexamining Freedman's Critique." Annals of Applied Statistics 7(1): 295–318. https://projecteuclid.org/euclid.aoas/1365527200
- ^ Miratrix, Luke W., Jasjeet S. Sekhon, and Bin Yu. 2013. "Adjusting Treatment Effect Estimates by Post-stratification in Randomized Experiments." Journal of the Royal Statistical Society, Series B 75(2): 369–396. https://dash.harvard.edu/bitstream/handle/1/30501585/Adjusting%20Treatment.pdf?sequence=1&isAllowed=y
- ^ Woertman, Willem, et al. 2013. "Stepped Wedge Designs Could Reduce the Required Sample Size in Cluster Randomized Trials." Journal of Clinical Epidemiology 66: 752–758. https://www.ncbi.nlm.nih.gov/pubmed/23523551
- ^ Tröhler, Ulrich. 2003. "James Lind and Scurvy: 1747 to 1795." JLL Bulletin: Commentaries on the History of Treatment Evaluation. http://www.jameslindlibrary.org/articles/james-lind-and-scurvy-1747-to-1795/
- ^ Ahmed, Akbar S. 2009. "The First Anthropologist." RAIN (Royal Anthropological Institute).
- ^ Fisher, R.A. 1937. The Design of Experiments. http://krishikosh.egranth.ac.in/bitstream/1/2040342/1/TNV-65.pdf
- ^ Gosnell, Harold F. 1926. "An Experiment in the Stimulation of Voting." American Political Science Review 20(4): 869–874. https://doi.org/10.1017/S0003055400110524
Category:Design of experiments
Category:Tests
Category:Causal inference
Category:Mathematical and quantitative methods (economics)