Jump to content

User:Modevs/sandbox

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Modevs (talk | contribs) at 21:34, 23 February 2013. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Design philosophy for business intelligence databases

A business intelligence (BI) database is a database that is oriented toward maximizing reporting and analytic value of data by aggregating and summarizing important pieces of existing data in such a way that large reporting queries can run more efficiently.

Business intelligence targeted databases differ from operational databases in that a BI database is designed to derive business information from existing data[1], which can be used to report on or gather meaning by aggregating existing data sources.

While operational databases are focused around data-input queries to accurately and in a normalized fashion manage data, business intelligence databases are focused on data-output, at the expense of normalization creating redundant derived data. Since data often will not be updated, this is a preferable and the requirements for BI databases are different than for example data entry requirements, denormalizing allows for faster reporting queries.[2]

Comparison of goals

The following table compares the goals of operational and BI-oriented databases.[3]

Data Sources

BI databases can derive data from a wide array of source. Examples that BI databases might aggregate existing data from include sources such as Point of sale applications, human resource applications, and survey results.[4] Data is generally aggregated using extract, transform and load tools.

Operations Business Intelligence
Minimize data redundancy (normalization) Optimize data to run quickly with a wide range of queries using mass amounts of data; de-normalized to provide quick retrieval of data.
Requires sub-second response time Response time is important however as queries generally will take seconds, minutes or even hours optimization is aimed towards minimizing the impact of these large queries
Minimal derived data, derived data is typically generated on-demand Large amounts of derived data to speed up report-based queries
Historical data often sacrificed for speed Large amounts of historical data purposely kept to generate temporal reports

Logical Design

Because of the specific goals of BI databases, database schemas such as the star schema and snowflake schema are often used to speed up queries, at the expense of initial or ongoing load time in managing a database. The decision on what type of database schema to use relies heavily on data aggregation and summarization requirements.

Backup and Recovery

Since BI databases are derived from existing data, it is often possible to recreate them from source data should they fail. However due to the amount of time it may take to recreate a BI database, and the likely need to utilize Extract, transform, load tools in order to recreate data, disaster recovery is still a considerable issue.

  1. ^ "Business Intelligence for the Enterprise". Retrieved 2013-02-23.
  2. ^ "Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications".
  3. ^ "Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications".
  4. ^ "Business Intelligence : Your Data Storage".