Comparison of five methods for finding conserved sequences in multiple alignments of gene regulatory regions.
AUTHOR(S)
Stojanovic, N
ABSTRACT
Conserved segments in DNA or protein sequences are strong candidates for functional elements and thus appropriate methods for computing them need to be developed and compared. We describe five methods and computer programs for finding highly conserved blocks within previously computed multiple alignments, primarily for DNA sequences. Two of the methods are already in common use; these are based on good column agreement and high information content. Three additional methods find blocks with minimal evolutionary change, blocks that differ in at most k positions per row from a known center sequence and blocks that differ in at most k positions per row from a center sequence that is unknown a priori. The center sequence in the latter two methods is a way to model potential binding sites for known or unknown proteins in DNA sequences. The efficacy of each method was evaluated by analysis of three extensively analyzed regulatory regions in mammalian beta-globin gene clusters and the control region of bacterial arabinose operons. Although all five methods have quite different theoretical underpinnings, they produce rather similar results on these data sets when their parameters are adjusted to best approximate the experimental data. The optimal parameters for the method based on information content varied little for different regulatory regions of the beta-globin gene cluster and hence may be extrapolated to many other regulatory regions. The programs based on maximum allowed mismatches per row have simple parameters whose values can be chosen a priori and thus they may be more useful than the other methods when calibration against known functional sites is not available.
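As a rough illustration of two of the ideas named in the abstract, the following Python sketch scores one alignment column by information content and scans an alignment for windows in which every row differs from a known center sequence in at most k positions. It is not the authors' programs. The function names, the uniform-background formula (2 bits minus the column entropy), the choice to ignore gap characters, and the toy alignment are all assumptions made for this example.

```python
import math
from collections import Counter


def column_information(column):
    """Information content (bits) of one DNA alignment column.

    Assumption: uniform background over A, C, G, T, so the score is
    2 - entropy. Gap and ambiguity characters are ignored.
    """
    bases = [c for c in column.upper() if c in "ACGT"]
    if not bases:
        return 0.0
    n = len(bases)
    counts = Counter(bases)
    entropy = -sum((m / n) * math.log2(m / n) for m in counts.values())
    return 2.0 - entropy


def blocks_near_center(rows, center, k):
    """Start columns where every row is within k mismatches of `center`.

    `rows` are equal-length strings taken from a precomputed alignment.
    A gap character simply counts as a mismatch here (an assumption).
    """
    width = len(center)
    length = len(rows[0])
    hits = []
    for start in range(length - width + 1):
        within_k = all(
            sum(a != b for a, b in zip(row[start:start + width], center)) <= k
            for row in rows
        )
        if within_k:
            hits.append(start)
    return hits


# Hypothetical three-row alignment, for illustration only.
alignment = [
    "ACGTTGACCA",
    "ACGTTGACGA",
    "TCGATGACCA",
]

exact_hits = blocks_near_center(alignment, "TGAC", 0)
near_hits = blocks_near_center(alignment, "CGTT", 1)
```

A perfectly conserved column scores 2 bits under this formula and a column with all four bases equally frequent scores 0. The mismatch scan needs only the window's center sequence and k, which is in line with the abstract's remark that these parameters can be chosen a priori.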
ARTICLE ACCESS
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=148654
RELATED DOCUMENTS
- Searching databases of conserved sequence regions by aligning protein multiple-alignments.
- Evolution of the casein multigene family: conserved sequences in the 5' flanking and exon regions.
- Upstream regulatory sequences of immunoglobulin genes are recognized by nuclear proteins which also bind to other gene regions.
- Analysis of the murine Hox-2.7 gene: conserved alternative transcripts with differential distributions in the nervous system and the potential for shared regulatory regions.
- Amplification of plant U3 and U6 snRNA gene sequences using primers specific for an upstream promoter element and conserved intragenic regions.