User:Kdabug/sandbox
EDITING FROM ARTICLE "FIELD EXPERIMENTS"
A field experiment applies the scientific method to experimentally examine researcher-designed interventions in real-world environments rather than in a laboratory setting. Field experiments, like lab experiments, generally randomize subjects (or other sampling units) into treatment and control groups and use information gathered from both groups to test claims of causal relationships. Field experiments offer researchers opportunities to test context-specific hypotheses while minimizing reliance on assumptions. They are based on the notion that empirical research can show whether exogenous variables (also called independent or explanatory variables) have an effect on a different (dependent) variable.
The term 'field' in 'field experiment' is a defining feature that allows researchers to see how subjects respond to interventions in environments that accurately reflect the distribution of treatment outcomes in the real world. This is in contrast to laboratory experiments, which enforce scientific control by testing a hypothesis in the artificial and highly controlled setting of a laboratory. Field experiments also differ contextually from naturally occurring experiments and quasi-experiments.[1] Whereas natural experiments arise when an external force (e.g. a government or nonprofit) controls the randomization of subjects into the treatment group, field experiments require researchers to control the randomization and implementation of the treatment. Quasi-experiments occur when no particular entity has the authority to separate subjects into treated and control groups; examples include U.S. congressional districts where candidates win by slim margins (since seemingly random factors separate near-winners from near-losers),[2] weather patterns, natural disasters, and other near-random phenomena.
Field experiments encompass a broad array of experimental designs, each with varying degrees of generality. Some criteria of generality (e.g. authenticity of treatments, participants, contexts, and outcome measures) refer to the contextual similarities between the subjects in the experimental sample and the rest of the population. This allows field experiments to be used in a number of settings; however, they are often used in the social sciences, especially in economic analyses of education and health interventions.
Examples of diverse field experiments include:
- Medical researchers conduct clinical trials of pharmaceuticals on patient populations.
- Economists have used field experiments to analyze discrimination, health care programs, charitable fundraising, education, information aggregation in markets, and microfinance programs.
- Engineers often conduct field tests of prototype products to validate earlier laboratory tests and to obtain broader feedback.
- Social science researchers use field experiments to see how humans interact and respond to changes in social, political, and governmental patterns.
Characteristics of Field Experiments
In a randomized field experiment, researchers separate participants into two or more groups: a treatment group (or groups) and a control group. Members of the treatment group(s) then receive the particular intervention being evaluated, while the control group does not. Field experiment researchers assume that each subject, whether sorted into a treatment or control group, has two potential outcomes, only one of which can be observed,[3] and that the causal effect of the treatment for each subject is the difference between these two outcomes. Since researchers can only measure the observed outcome for each subject, they use field experiments to find an unbiased estimator of the average treatment effect on a random sample of the population. Often, researchers will take the difference between the mean observed outcomes of the treated and control groups (the difference-in-means estimator); however, analyzing whether this (or any) estimator is unbiased requires examining the design, randomization, and implementation of the intervention. This means that not only should the subjects be a random subset of the population, but also that the subjects need to be randomly assigned to either the treated or control groups.
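The potential-outcomes logic in this paragraph can be written compactly. The following is a standard notation sketch, not a formula quoted from the cited source, with T and C denoting the treated and control groups of sizes n_T and n_C:

```latex
% Unit-level causal effect, the average treatment effect (ATE), and the
% difference-in-means estimator over treated (T) and control (C) groups.
\tau_i = Y_i(1) - Y_i(0), \qquad
\mathrm{ATE} = \mathbb{E}\left[\,Y_i(1) - Y_i(0)\,\right], \qquad
\widehat{\mathrm{ATE}} = \frac{1}{n_T}\sum_{i \in T} Y_i \;-\; \frac{1}{n_C}\sum_{i \in C} Y_i .
```

Only one of Y_i(1) and Y_i(0) is ever observed for a given subject, which is why random assignment is needed for the difference-in-means estimator to be unbiased.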
Along with randomization of subjects into treated and untreated groups, two other core assumptions underlie the researcher's ability to collect unbiased estimates: excludability and non-interference. The excludability assumption stipulates that the only relevant causal pathway is receipt of the treatment itself; asymmetries in the assignment, administration, or measurement of treatment and control groups violate this assumption. The non-interference assumption, or Stable Unit Treatment Value Assumption (SUTVA), indicates that the value of the outcome depends only on whether the subject is assigned the treatment, not on whether other subjects are assigned to the treatment. When these three core assumptions are met, researchers are more likely to obtain unbiased estimates from field experiments.
After designing the field experiment and gathering the data, researchers can use statistical inference tests to determine the size and strength of the intervention's effect on the subjects. Field experiments allow researchers to collect data of diverse types and quantities. For example, a researcher could design an experiment that uses pre- and post-trial information in an appropriate statistical inference method to see whether an intervention has an effect on subject-level changes in outcomes.
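As a concrete illustration of such an inference method, the following is a minimal sketch of a randomization (permutation) test on pre/post change scores. All data are simulated placeholders; the "true" effect of 3.0 and the sample size are assumptions for illustration only.

```python
# A minimal sketch of a randomization test on pre/post change scores.
import numpy as np

rng = np.random.default_rng(0)
n = 200
treat = rng.permutation(np.repeat([0, 1], n // 2))  # random assignment
pre = rng.normal(50, 10, n)                         # pre-trial measurement
post = pre + rng.normal(0, 5, n) + 3.0 * treat      # assumed effect of 3.0

change = post - pre                                 # subject-level change
observed = change[treat == 1].mean() - change[treat == 0].mean()

# Build the null distribution by re-randomizing the assignment labels.
null = []
for _ in range(5000):
    shuffled = rng.permutation(treat)
    null.append(change[shuffled == 1].mean() - change[shuffled == 0].mean())
p_value = np.mean(np.abs(null) >= abs(observed))

print(f"effect on change scores: {observed:.2f}, two-sided p = {p_value:.3f}")
```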
Practical Uses
Field experiments are seen by some academics as a rigorous way of testing general theories about economic and political behavior. They have gained popularity because they allow researchers to guard against selection bias. Selection bias refers to the fact that, in non-experimental settings, the group receiving an intervention is likely different from the group not receiving it. This may occur because of characteristics that make some people more likely to opt into a program, or because of program targeting.
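A small simulation can make the problem concrete. In the sketch below (all parameters are illustrative assumptions), subjects with higher baseline motivation both opt into a program and have better outcomes, so a naive comparison overstates the true effect that randomization recovers:

```python
# A minimal simulation of selection bias versus randomized assignment.
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
motivation = rng.normal(0, 1, n)
opted_in = rng.random(n) < 1 / (1 + np.exp(-motivation))    # self-selection
outcome = 2.0 * opted_in + 3.0 * motivation + rng.normal(0, 1, n)  # true effect = 2

naive = outcome[opted_in].mean() - outcome[~opted_in].mean()

randomized = rng.random(n) < 0.5                            # experiment instead
outcome_rct = 2.0 * randomized + 3.0 * motivation + rng.normal(0, 1, n)
experimental = outcome_rct[randomized].mean() - outcome_rct[~randomized].mean()

print(f"naive comparison: {naive:.2f}, randomized estimate: {experimental:.2f}")
```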
Development economists have used field experiments to measure the effectiveness of poverty and health programs in developing countries. Organizations such as the Abdul Latif Jameel Poverty Action Lab (J-PAL) at the Massachusetts Institute of Technology, the Center of Evaluation for Global Action at the University of California, and Innovations for Poverty Action (IPA) in particular have received attention for their use of randomized field experiments to evaluate development programs. The aim of field experiments used in development research is to find causal relationships between policy interventions and development outcomes.
Along with theory and effectiveness testing, field experiments can act as benchmarks for comparing observational data to experimental results. This comparison can help determine levels of bias in observational studies, and, since researchers develop a hypothesis from an a priori judgment, benchmarks can help to add credibility to a study.[4] While some argue that covariate adjustment or matching designs might work just as well in eliminating bias, field experiments can increase certainty by reducing omitted variable bias, because randomization balances observed and unobserved factors across groups.
Building on the use of field experiments as benchmarks, researchers can utilize machine learning methods to simulate, reweight, and generalize experimental data.[5] This increases the speed and efficiency of gathering experimental results and reduces the costs of implementing the experiment. Another cutting-edge technique in field experiments is the multi-armed bandit design,[6] along with similar adaptive designs for experiments with variable outcomes and treatments over time.[7]
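A common implementation of the multi-armed bandit design is Thompson sampling, in the spirit of the Bayesian approach cited above. The sketch below uses assumed conversion rates for two hypothetical treatment arms:

```python
# A minimal Thompson-sampling sketch for a two-armed Bernoulli bandit.
import numpy as np

rng = np.random.default_rng(1)
true_p = [0.10, 0.15]              # assumed (unknown) success rates per arm
successes = np.ones(2)             # Beta(1, 1) uniform priors
failures = np.ones(2)

for _ in range(10_000):
    # Sample a plausible rate for each arm from its posterior, play the best.
    draws = rng.beta(successes, failures)
    arm = int(np.argmax(draws))
    reward = rng.random() < true_p[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward

print("posterior means:", successes / (successes + failures))
```

The design adaptively concentrates assignment on the better-performing arm while continuing to explore, which is why such schemes suit experiments whose treatments and outcomes evolve over time.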
Limitations
There are limitations of and arguments against using field experiments in place of other research designs (e.g. lab experiments, survey experiments, observational studies, etc.). Some academics dispute the claim that findings from field experiments are sufficient for establishing and testing theories about behavior. In particular, a hotly contested issue with regard to field experiments is their external validity.[8] Given that field experiments necessarily take place in a specific geographic and political setting, there is a concern about the extent to which findings can be extrapolated to formulate a general theory regarding economic behavior. However, researchers have begun to find strategies to effectively generalize causal effects outside the sample setting by comparing the environments of the treated population and the external population, accessing information from larger sample sizes, and accounting for and modeling treatment effect heterogeneity within the sample.[9] Others have used covariate blocking techniques to generalize from field experiment populations to external populations (Egami, Naoki, and Erin Hartman. 2018. "Covariate Selection for Generalizing Experimental Results." Unpublished manuscript. https://scholar.princeton.edu/sites/default/files/negami/files/covselect.pdf).
There are some logistical limitations to data collection and unbiased statistical estimation in field experiments, and some scholars argue that these limitations make lab experiments a superior option for finding causal relationships. One such difficulty, one-sided noncompliance,[10][11] occurs when subjects who are assigned to treatment never receive it, either because they are hard to reach or because they refuse treatment. Similarly, there are cases where subjects assigned to control groups mistakenly receive treatment (two-sided noncompliance). Other data collection problems include attrition, where subjects fail to provide outcome data, which under certain conditions can also bias the collected data. These problems can lead to imprecise analysis; however, researchers who use field experiments can apply statistical methods to calculate useful information even when these difficulties occur.[12]
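One standard approach to one-sided noncompliance, in line with the instrumental-variables work cited above, is to divide the intent-to-treat (ITT) effect by the compliance rate, giving a Wald-style estimate of the complier average causal effect (CACE). The sketch below uses simulated data with assumed parameters:

```python
# A minimal sketch of ITT and CACE estimation under one-sided noncompliance.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
assigned = rng.integers(0, 2, n)              # random assignment to treatment
complier = rng.random(n) < 0.7                # assumed 70% would comply
took = assigned & complier                    # one-sided: controls never treated
outcome = 1.0 * took + rng.normal(0, 1, n)    # assumed true effect of 1.0

itt = outcome[assigned == 1].mean() - outcome[assigned == 0].mean()
compliance_rate = took[assigned == 1].mean()
cace = itt / compliance_rate                  # Wald estimator
print(f"ITT = {itt:.2f}, compliance = {compliance_rate:.2f}, CACE = {cace:.2f}")
```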
Similarly, conducting a field experiment can lead to concerns over interference[13] between subjects. When treated subjects or groups affect the outcomes of untreated groups (through contagion, displacement, communication, social comparison, deterrence, etc.), the untreated groups might not exhibit their true untreated outcomes. A subset of interference is the spillover effect, which occurs when the treatment of treated groups has an effect on neighboring untreated groups.
There is also the concern of heterogeneous treatment effects, which occur when subjects with similar attributes respond to treatment categorically differently from other subjects. This creates across-treatment variability that can present problems for assessing causal mechanisms, leading some researchers to use covariate adjustment methods. Critics argue that such adjustment adds variability, error, and potential bias to results; however, others show that increasing the sample size will dull the negative effects of using these and other regression methods.[14] Another popular method to address heterogeneous treatment effects is a block design, which stratifies subjects based on observable traits and randomizes treatment and control within those strata; this can be done effectively before or after implementing an intervention.[15]
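A minimal sketch of block randomization follows, using a hypothetical "region" trait as the blocking variable; the subject count and strata are illustrative assumptions:

```python
# Stratify subjects on an observed trait, then randomize within each stratum
# so heterogeneous groups are balanced across treatment and control.
import random
from collections import defaultdict

random.seed(3)
subjects = [{"id": i, "region": random.choice(["north", "south"])}
            for i in range(20)]

strata = defaultdict(list)
for s in subjects:
    strata[s["region"]].append(s)

for region, members in strata.items():
    random.shuffle(members)
    half = len(members) // 2
    for s in members[:half]:
        s["arm"] = "treatment"
    for s in members[half:]:
        s["arm"] = "control"

print([(s["id"], s["region"], s["arm"]) for s in subjects])
```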
Some limitations also exist in the design and implementation of field experiments. Field experiments are expensive and time-consuming to conduct, and if not planned carefully beforehand, they might miss crucial data or return biased estimates. At any point in the implementation of a field experiment, the researcher can face problems that, if left uncorrected, can ruin the entire process. The researcher should be aware not only of the randomization of subjects into groups but also of how subjects will perceive the fairness of the randomization (e.g. in 'negative income tax' experiments, communities may lobby for their community to receive a cash transfer, so the assignment is not purely random). Similarly, those who implement the randomization could contaminate the randomization scheme. Ethical considerations can also factor into experimental design: field experiments can adopt a "stepped-wedge" design that eventually gives the entire sample access to the intervention on different timing schedules.[16] Researchers can also design a blinded field experiment to remove possibilities of manipulation.
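The stepped-wedge idea can be sketched as a rollout schedule in which every cluster eventually crosses over to treatment, with the crossover order randomized. Cluster names and period counts below are illustrative assumptions:

```python
# A minimal stepped-wedge rollout: all clusters end up treated, on a
# randomized timing schedule; period 0 is an all-control baseline.
import random

random.seed(4)
clusters = ["A", "B", "C", "D"]
periods = 5
random.shuffle(clusters)  # randomize the order in which clusters cross over

# Cluster at position i crosses over after period i.
schedule = {
    c: ["control"] * (i + 1) + ["treatment"] * (periods - i - 1)
    for i, c in enumerate(clusters)
}
for c in sorted(schedule):
    print(c, schedule[c])
```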
There is also a certain difficulty of replicability: field experiments often require special access or permission, and precise technical instructions for how to replicate a field experiment are rarely, if ever, available in economics. There are also limits on a researcher's ability to obtain the informed consent of all participants. Field testing is always less controlled than laboratory testing, which increases the variability of the types and magnitudes of stress in field tests. The resulting data are therefore more varied (larger standard deviations, less precision and accuracy), which leads to the use of larger sample sizes in field testing. However, others argue that, even though replicability is difficult, if the results of an experiment are important, there is a greater chance that the experiment will be replicated.
Noteworthy Field Experiments
The history of experiments in the lab and the field has left longstanding impacts in the physical, natural, and life sciences.
Field experiments in the physical sciences and clinical research:
- Geology has a long history of field experiments, since the time of Avicenna.[citation needed]
- Anthropology field experiments date back to Biruni's study of India.[17]
- Pioneering figures in social psychology who utilized field experiments include Philip Zimbardo, Kurt Lewin, and Stanley Milgram. In the 1700s, James Lind utilized a controlled field experiment to identify a treatment for scurvy using different interventions.[18]
Field experiments in economics: In economics, Peter Bohm of the University of Stockholm was one of the first economists to take the tools of experimental economic methods and apply them to field subjects. In the area of development economics, the pioneering work of Hans Binswanger, who conducted experiments in India on risk behavior in the late 1970s,[1] should also be noted. The use of field experiments in economics has grown recently with the work of John A. List, Jeff Carpenter, Juan-Camilo Cardenas, Abigail Barr, Catherine Eckel, Michael Kremer, Paul Gertler, Glenn Harrison, Colin Camerer, Bradley Ruffle, Abhijit Banerjee, Esther Duflo, Dean Karlan, Edward "Ted" Miguel, Sendhil Mullainathan, and David H. Reiley, among others.
Field experiments in political science:
See also
References
- ^ Meyer, Bruce D. 1995. "Natural and Quasi-Experiments in Economics." Journal of Business & Economic Statistics 13(2): 151–161. https://www.jstor.org/stable/1392369
- ^ Lee, David S., Enrico Moretti, and Matthew J. Butler. 2004. "Do Voters Affect or Elect Policies? Evidence from the U.S. House." Quarterly Journal of Economics. https://www.jstor.org/stable/25098703
- ^ Rubin, Donald B. 2005. "Causal Inference Using Potential Outcomes: Design, Modeling, Decisions." Journal of the American Statistical Association 100(469): 322–331. https://doi.org/10.1198/016214504000001880
- ^ Harrison, Glenn W., and John A. List. 2004. "Field Experiments." Journal of Economic Literature 42(4): 1009–1055. https://www.jstor.org/stable/3594915
- ^ Athey, Susan, and Guido Imbens. 2016. "Recursive Partitioning for Heterogeneous Causal Effects." PNAS 113(27): 7353–7360. https://doi.org/10.1073/pnas.1510489113
- ^ Scott, Steven L. 2010. "A Modern Bayesian Look at the Multi-Armed Bandit." Applied Stochastic Models in Business and Industry 26: 639–658. http://www.economics.uci.edu/~ivan/asmb.874.pdf
- ^ Raj, Vishnu, and Sheetal Kalyani. 2017. "Taming Non-Stationary Bandits: A Bayesian Approach." https://arxiv.org/abs/1707.09727
- ^ Duflo, Esther. "Field Experiments in Development Economics." http://econ-www.mit.edu/files/800
- ^ Dehejia, Rajeev, Cristian Pop-Eleches, and Cyrus Samii. 2017. "From Local to Global: External Validity in a Fertility Natural Experiment." NBER Working Paper 21459. https://www.nber.org/papers/w21459
- ^ Blackwell, Matthew. 2015. "Instrumental Variable Methods for Conditional Effects and Causal Interaction in Voter Mobilization Experiments." https://doi.org/10.1080/01621459.2016.1246363
- ^ Aronow, Peter M., and Alison Carnegie. 2013. "Beyond LATE: Estimation of the Average Treatment Effect with an Instrumental Variable." Political Analysis 21: 492–506. https://www.cambridge.org/core/journals/political-analysis/article/beyond-late-estimation-of-the-average-treatment-effect-with-an-instrumental-variable/604E0803793175CF88329DB34DAA80B3
- ^ Aronow, Peter M., and Alison Carnegie. 2013. "Beyond LATE: Estimation of the Average Treatment Effect with an Instrumental Variable." Political Analysis 21: 492–506. https://www.cambridge.org/core/journals/political-analysis/article/beyond-late-estimation-of-the-average-treatment-effect-with-an-instrumental-variable/604E0803793175CF88329DB34DAA80B3
- ^ Aronow, Peter M., and Cyrus Samii. 2017. "Estimating Average Causal Effects Under General Interference, with Application to a Social Network Experiment." Annals of Applied Statistics. https://projecteuclid.org/euclid.aoas/1514430272
- ^ Lin, Winston. 2013. "Agnostic Notes on Regression Adjustments to Experimental Data: Reexamining Freedman's Critique." Annals of Applied Statistics 7(1): 295–318. https://projecteuclid.org/euclid.aoas/1365527200
- ^ Miratrix, Luke W., Jasjeet S. Sekhon, and Bin Yu. 2013. "Adjusting Treatment Effect Estimates by Post-Stratification in Randomized Experiments." Journal of the Royal Statistical Society, Series B 75(2): 369–396. https://dash.harvard.edu/bitstream/handle/1/30501585/Adjusting%20Treatment.pdf?sequence=1&isAllowed=y
- ^ Woertman, Willem, et al. 2013. "Stepped Wedge Designs Could Reduce the Required Sample Size in Cluster Randomized Trials." Journal of Clinical Epidemiology 66: 752–758. https://www.ncbi.nlm.nih.gov/pubmed/23523551
- ^ Ahmed, Akbar S. 2009. "The First Anthropologist." RAIN (Royal Anthropological Institute).
- ^ Tröhler, Ulrich. 2003. "James Lind and Scurvy: 1747 to 1795." JLL Bulletin: Commentaries on the History of Treatment Evaluation. http://www.jameslindlibrary.org/articles/james-lind-and-scurvy-1747-to-1795/
- ^ Gordon, Brett R., et al. 2017. "A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook." https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3033144
Category:Design of experiments
Category:Tests
Category:Causal inference
Category:Mathematical and quantitative methods (economics)