ADDA: a domain database with global coverage of the protein universe

AUTHOR(S)
SOURCE

Oxford University Press

ABSTRACT

We used the Automatic Domain Decomposition Algorithm (ADDA) to generate a database of protein domain families with complete coverage of all protein sequences. Sequences are split into domains and domains are grouped into protein domain families in a completely automated process. The current database contains domains for more than 1.5 million sequences in more than 40 000 domain families. In particular, there are 3828 novel domain families that do not overlap with the curated domain databases Pfam, SCOP and InterPro. The data are freely available for downloading and querying via a web interface (http://ekhidna.biocenter.helsinki.fi:9801/sqgraph/pairsdb).
