Conserved Domain Database

The Conserved Domain Database (CDD) is a database of well-annotated multiple sequence alignment models, and of database search models derived from them, for evolutionarily ancient protein domains and full-length proteins.^[1] CDD content includes domain models manually curated by NCBI and domain models imported from several external source databases (Pfam, SMART, COG, PRK, TIGRFAMs). A distinguishing feature of NCBI-curated domains is that they use 3D-structure information to explicitly define domain boundaries, align blocks, amend alignment details, and provide insight into sequence/structure/function relationships. Manually curated models are organized hierarchically when they describe domain families that are clearly related by common descent. To avoid redundancy, CDD clusters the domain models from its various sources into superfamilies.

Goals

Domains can be thought of as distinct functional and/or structural units of a protein. These two classifications often coincide: a unit of a polypeptide chain that folds independently frequently also carries a specific function. Domains are often identified as recurring (sequence or structure) units, which may exist in various contexts. In molecular evolution such domains may have been used as building blocks and recombined in different arrangements to modulate protein function. CDD defines conserved domains as recurring units in molecular evolution, the extents of which can be determined by sequence and structure analysis.

The goal of the NCBI conserved domain curation project is to give database users insight into how patterns of residue conservation and divergence in a family relate to functional properties, and to provide useful links to more detailed information that may help in understanding those sequence/structure/function relationships. To do this, CDD curators supplement and enrich the traditional multiple sequence alignments that form the foundation of domain models with the following types of information: 3-dimensional structures and conserved core motifs, conserved features/sites, phylogenetic organization, and links to electronic literature resources.

Querying the database

The collection is also part of NCBI’s Entrez query and retrieval system, cross-linked to numerous other resources. CDD provides annotation of domain footprints and conserved functional sites on protein sequences. Precalculated domain annotation can be retrieved for protein sequences tracked in NCBI’s Entrez system, and CDD’s collection of models can be queried with novel protein sequences via the CD-Search service, or via Batch CD-Search, which allows the computation and download of annotation for large sets of protein queries.
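Because the collection is part of Entrez, one plausible way to search it from a script is through NCBI's E-utilities `esearch` endpoint with `db=cdd`. The sketch below is illustrative only and is not described in the source: the function names are hypothetical, the choice of E-utilities is an assumption, and it only runs a text search over domain records. It does not submit a protein sequence to CD-Search.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Public E-utilities esearch endpoint (assumed access route; not named in the article).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_cdd_search_url(term, retmax=5):
    """Build an esearch URL that runs a text query against the Entrez 'cdd' database."""
    params = {"db": "cdd", "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def parse_id_list(xml_text):
    """Extract the record identifiers from an esearch XML response."""
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.findall("./IdList/Id")]


def search_cdd(term, retmax=5):
    """Run the query over the network and return the matching record identifiers."""
    with urlopen(build_cdd_search_url(term, retmax)) as response:
        return parse_id_list(response.read().decode("utf-8"))
```

Calling `search_cdd("SH3 domain")` would perform a live request, so the result depends on the current state of the database. `build_cdd_search_url` and `parse_id_list` are kept separate so that the request construction and the response parsing can be checked without network access.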

Notes

[1] Marchler-Bauer, A.; Zheng, C.; Chitsaz, F.; Derbyshire, M. K.; Geer, L. Y.; Geer, R. C.; Gonzales, N. R.; Gwadz, M.; Hurwitz, D. I.; Lanczycki, C. J.; Lu, F.; Lu, S.; Marchler, G. H.; Song, J. S.; Thanki, N.; Yamashita, R. A.; Zhang, D.; Bryant, S. H. (2012). "CDD: Conserved domains and protein three-dimensional structure". Nucleic Acids Research 41 (Database issue): D348–D352. PMC 3531192. PMID 23197659. doi:10.1093/nar/gks1243.

See also

External links

"Conserved Domains Database (CDD) and Resource Group". United States National Center for Biotechnology Information.
