Data generating process
The term data generating process is used in statistical and scientific literature to convey a number of different ideas:
- the data collection process, that is, the routes and procedures by which data reach a database (particularly where these may change over time);
- a specific statistical model that is used to represent supposed random variations in observations, often in terms of explanatory and/or latent variables;
- a general but non-specific model (not directly or explicitly set down) that notionally includes all of the random influences that combine to produce individual observations. One instance is the supposed justification of the "common occurrence" of the normal distribution as the result of many additive random effects: see central limit theorem.
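
The last sense can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a standard procedure: the function names `simulate_observations` and `excess_kurtosis`, the choice of Uniform(-1, 1) effects, and the sample sizes are all hypothetical. Each observation is generated as the sum of several independent effects, and the excess kurtosis (which is 0 for a normal distribution) is used as a rough indicator of how close the result is to normal.

```python
import random
import statistics


def simulate_observations(n_obs, n_effects, seed=0):
    """Generate n_obs observations, each the sum of n_effects independent
    Uniform(-1, 1) effects (the effect distribution is an illustrative choice)."""
    rng = random.Random(seed)
    return [
        sum(rng.uniform(-1.0, 1.0) for _ in range(n_effects))
        for _ in range(n_obs)
    ]


def excess_kurtosis(values):
    """Sample excess kurtosis: fourth central moment / variance^2, minus 3."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    fourth = statistics.fmean((v - mean) ** 4 for v in values)
    return fourth / var ** 2 - 3.0


# One effect per observation: the data are simply uniform.
single = simulate_observations(20_000, 1)
# Thirty additive effects per observation: the sum is much closer to normal.
combined = simulate_observations(20_000, 30)

print("excess kurtosis, 1 effect:  ", excess_kurtosis(single))
print("excess kurtosis, 30 effects:", excess_kurtosis(combined))
```

With a single effect the excess kurtosis should sit near -1.2, the theoretical value for a uniform distribution. With thirty additive effects it should be close to 0, which is consistent with the central limit theorem argument, although a near-zero kurtosis alone does not prove normality.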