Integrative bioinformatics

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Bookverma at 16:25, 28 October 2012 (External links: Adding another resource for further study). It may differ significantly from the current revision.

Integrative bioinformatics is a discipline of bioinformatics that focuses on problems of data integration for the life sciences.

With the rise of high-throughput (HTP) technologies in the life sciences, particularly in molecular biology, the amount of collected data has grown exponentially. Furthermore, the data are scattered across many public and private repositories and are stored in a large number of different formats. This makes it very difficult to search the data and to perform the analyses needed to extract new knowledge from the complete set of available data. Integrative bioinformatics attempts to tackle this problem by providing unified access to life science data.


Approaches

Semantic web approaches

In the Semantic Web approach, data from multiple websites or databases are searched via metadata. Metadata are machine-readable annotations that allow a program to match search terms more accurately, which should reduce the number of irrelevant or unhelpful results. Some metadata take the form of ontologies, sets of defined terms and the relationships between them, which facilitate searches by using key terms or phrases to find and return the data [1]. Advantages of this approach include the generally higher quality of the data returned in searches and the ability of ontologies to link entries that do not explicitly contain the search term but are still relevant. One disadvantage is that results are returned in the format of their database of origin, so direct comparisons may be difficult. In addition, the Semantic Web approach is still considered an emerging technology and is not in wide-scale use at this time [2].
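
The way an ontology can surface entries that never mention the search term can be illustrated with a minimal Python sketch. It is not how any particular Semantic Web tool works: the ontology, the records, and the function names (expand_term, search) are all hypothetical and chosen only for illustration. The idea is to expand the query term to its more specific descendant terms before matching it against record metadata.

```python
from collections import deque

# Hypothetical toy ontology (an assumption for illustration, not a real
# resource): each term maps to its more specific child terms.
ONTOLOGY = {
    "cell death": ["apoptosis", "necrosis"],
    "apoptosis": ["anoikis"],
}

# Hypothetical records, each annotated with a set of metadata terms.
RECORDS = [
    {"id": "rec1", "terms": {"apoptosis"}},
    {"id": "rec2", "terms": {"anoikis"}},
    {"id": "rec3", "terms": {"glycolysis"}},
    {"id": "rec4", "terms": {"cell death", "glycolysis"}},
]


def expand_term(term, ontology):
    """Return the term together with all of its descendants."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in ontology.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def search(term, records, ontology):
    """Return ids of records annotated with the term or any descendant."""
    wanted = expand_term(term, ontology)
    return [r["id"] for r in records if r["terms"] & wanted]


if __name__ == "__main__":
    # rec1 and rec2 never mention "cell death" but are still returned.
    print(search("cell death", RECORDS, ONTOLOGY))
```

In this sketch a search for "cell death" also returns the records annotated only with "apoptosis" or "anoikis", while a plain keyword match would return only the record that carries the literal term.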

Data warehousing approaches

In the data warehousing strategy, data from different sources are extracted and integrated into a single database. For example, various ‘omics’ datasets, such as those from genomics, transcriptomics, proteomics, interactomics, and metabolomics, may be integrated to provide insights into biological systems. Ideally, changes in these sources are regularly synchronized to the integrated database. The data are presented to users in a common format. Many programs designed to aid the creation of such warehouses are deliberately versatile so that they can be used in diverse research projects [3]. One advantage of this approach is that the data are available for analysis at a single site, using a uniform schema. Disadvantages are that the datasets are often huge and difficult to keep up to date, and that compiling such a warehouse is costly [4].
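
The extract-and-integrate step can be sketched in Python with the standard sqlite3 module. Everything specific here is an assumption made for illustration: the two in-memory "sources", their field layouts, the measurement table, and the function names are hypothetical and do not describe Atlas or any other real warehouse. The sketch maps two differently shaped sources onto one uniform schema and uses INSERT OR REPLACE so that re-running the load stands in for a periodic synchronization.

```python
import sqlite3

# Hypothetical source 1: transcriptomics records as dictionaries.
TRANSCRIPTOMICS = [
    {"gene": "TP53", "expression": 12.5},
    {"gene": "BRCA1", "expression": 3.1},
]

# Hypothetical source 2: proteomics rows as (symbol, abundance) tuples.
PROTEOMICS = [("TP53", 0.8), ("MYC", 2.4)]


def build_warehouse(conn):
    """Load both sources into one table with a uniform schema."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurement ("
        "gene TEXT NOT NULL, "
        "omics_type TEXT NOT NULL, "
        "value REAL NOT NULL, "
        "PRIMARY KEY (gene, omics_type))"
    )
    # Map each source-specific layout onto the common (gene, type, value) form.
    rows = [(r["gene"], "transcriptomics", r["expression"])
            for r in TRANSCRIPTOMICS]
    rows += [(symbol, "proteomics", abundance)
             for symbol, abundance in PROTEOMICS]
    # INSERT OR REPLACE overwrites stale rows when the load is repeated.
    conn.executemany(
        "INSERT OR REPLACE INTO measurement VALUES (?, ?, ?)", rows
    )
    conn.commit()


def measurements_for(conn, gene):
    """Return all integrated measurements for one gene, sorted by type."""
    cursor = conn.execute(
        "SELECT omics_type, value FROM measurement "
        "WHERE gene = ? ORDER BY omics_type",
        (gene,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
build_warehouse(conn)

if __name__ == "__main__":
    print(measurements_for(conn, "TP53"))
```

Once loaded, a single query returns every measurement for a gene regardless of which source it came from. The cost noted above shows up even in this toy version: each new source needs its own mapping code, and the load must be repeated whenever a source changes.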

Other approaches

See also

Notes

  1. ^ Doms, A. & Schroeder, M. (2005). “GoPubMed: exploring PubMed with the Gene Ontology.” Nucleic Acids Research. Retrieved 28 September 2012.
  2. ^ Ruttenberg, et al. (2007). “Advancing translational research with the Semantic Web.” BMC Bioinformatics. Retrieved 28 September 2012.
  3. ^ Shah, et al. (2005). “Atlas – a data warehouse for integrative bioinformatics.” BMC Bioinformatics. Retrieved 30 September 2012.
  4. ^ Kuenne, et al. (2007). “Using Data Warehouse Technology in Crop Plant Bioinformatics.” Journal of Integrative Bioinformatics. Retrieved 30 September 2012.

References