Machine-generated data
Machine-generated data (MGD) is the generic term for information that is automatically created by a computer process, application, or other machine without human intervention. Although machine-generated data can be created as a result of some human action, it excludes data manually entered by an end user.[1] Machine-generated data spans all industry sectors, and humans increasingly generate it unknowingly.[2]
Relevance of Machine Generated Data
Machine-generated data tends to have no single fixed form, and users typically never modify it. Machines often generate this data as a consistent response to an event that has occurred. Because the event is historical, the data is less likely to be updated or modified. Partly because of this quality, U.S. courts consider machine-generated data highly reliable.[3]
Handling Machine Generated Data
In 2009, Gartner predicted that data would grow by 650% over the following five years.[4] Most of that growth comes from machine-generated data.[1]
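To put the five-year figure in yearly terms, the following minimal sketch computes the constant annual growth rate that would compound to it. It assumes that "grow by 650%" means the final volume is 7.5 times the starting volume and that growth is evenly compounded; neither assumption comes from the Gartner forecast itself, and the function name is illustrative.

```python
def implied_annual_growth(total_growth_pct: float, years: int) -> float:
    """Return the constant yearly growth rate (as a fraction) that
    compounds to the given total percentage increase over `years` years."""
    # Assumption: a 650% increase means ending at 1 + 6.5 = 7.5x the start.
    final_multiple = 1 + total_growth_pct / 100
    return final_multiple ** (1 / years) - 1


rate = implied_annual_growth(650, 5)
print(f"Implied annual growth: {rate:.1%}")
```

Under these assumptions the implied rate is roughly 50% per year, meaning the data volume would double in less than two years.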
Processing Machine Generated Data
Given the fairly static yet voluminous nature of machine-generated data, data owners rely on highly scalable tools to process and analyze the resulting datasets. Almost all machine-generated data is structured,[1] so the extract, transform, load (ETL) processing can be fairly simple. The main challenge lies in data analytics: given high performance requirements and large data sizes, traditional database indexing and partitioning limit the size and history of the dataset that can be processed. Columnar databases offer an alternative approach, because a particular analysis accesses only the relevant "columns" of the dataset.[5]
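The idea behind column-oriented storage can be illustrated with a small in-memory sketch. The records and field names below are hypothetical, and a real columnar database adds compression, on-disk layout, and query planning that are not shown here.

```python
# Hypothetical web-log records; the field names are illustrative only.
rows = [
    {"timestamp": "2010-12-01T10:00:00", "status": 200, "bytes": 5120},
    {"timestamp": "2010-12-01T10:00:01", "status": 404, "bytes": 312},
    {"timestamp": "2010-12-01T10:00:02", "status": 200, "bytes": 2048},
]

# Row-oriented layout: each record is kept together, so summing one field
# still walks through every record and all of its fields.
total_bytes_row_store = sum(row["bytes"] for row in rows)

# Column-oriented layout: each field is stored as its own list.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one field now reads only that column.
total_bytes_column_store = sum(columns["bytes"])

print(total_bytes_row_store, total_bytes_column_store)
```

Both layouts give the same answer. The difference is how much data must be read to reach it, which matters once the dataset holds billions of records with many fields each.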
Examples of Machine Generated Data
- Web logs [6]
- Call detail records [6]
- Financial instrument trades [6]
- Network event logs [6]
- SIEM logs
- Telemetry collected by the government [6]
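Web logs, the first example above, show why the structured nature of machine-generated data keeps extraction simple. The following sketch parses one line in the Common Log Format used by many web servers. The pattern covers only the basic fields, and the function name `parse_clf` and the sample line are illustrative rather than taken from any cited source.

```python
import re
from datetime import datetime

# Basic Common Log Format: host ident user [time] "request" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line):
    """Parse one Common Log Format line into a dict, or return None."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the text fields that have an obvious typed representation.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # The format uses "-" when no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
record = parse_clf(sample)
print(record["host"], record["status"], record["size"])
```

Because every line follows the same layout, a single pattern is enough to turn the raw text into typed fields ready for loading into an analytical store.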