User:Modevs/sandbox
Design philosophy for business intelligence databases
A business intelligence (BI) database is a database that is oriented toward maximizing reporting and analytic value of data by aggregating and summarizing important pieces of existing data in such a way that large reporting queries can run more efficiently.
Business intelligence targeted databases differ from operational databases in that a BI database is designed to derive business information from existing data[1], which can be used to report on or gather meaning by aggregating existing data sources.
While operational databases are focused around data-input queries to accurately and in a normalized fashion manage data, business intelligence databases are focused on data-output, at the expense of normalization creating redundant derived data. Since data often will not be updated, this is a preferable and the requirements for BI databases are different than for example data entry requirements, denormalizing allows for faster reporting queries[2].
Data Sources
Examples of sources that BI databases might aggregate existing data from include sources such as Point of sale applications, human resource applications, and survey results[3].
Comparison of goals
The following table compares the goals of operational and BI-oriented databases.[4]
Operations | Business Intelligence |
---|---|
Minimize data redundancy (normalization) | Optimize data to run quickly with a wide range of queries using mass amounts of data; de-normalized to provide quick retrieval of data. |
Requires sub-second response time | Response time is important however as queries generally will take seconds, minutes or even hours optimization is aimed towards minimizing the impact of these large queries |
Minimal derived data, derived data is typically generated on-demand | Large amounts of derived data to speed up report-based queries |
Historical data often sacrificed for speed | Large amounts of historical data purposely kept to generate temporal reports |
Logical Design
Because of the specific goals of BI databases, database schemas such as the star schema and snowflake schema are often used to speed up queries, at the expense of initial or ongoing load time in managing a database. The decision on what type of database schema to use relies heavily on data aggregation and summarization requirements.
Backup and Recovery
Since BI databases are derived from existing data, it is often possible to recreate them from source data should they fail. However due to the amount of time it may take to recreate a BI database, and the likely need to utilize Extract, transform, load tools in order to recreate data, disaster recovery is still a considerable issue.
- ^ "Business Intelligence for the Enterprise". Retrieved 2013-02-23.
- ^ "Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications".
- ^ "Business Intelligence : Your Data Storage".
- ^ "Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications".