Significance of similarities in patterns: an application to beta interferon-related DNA on human chromosome 2.

ABSTRACT

The nucleotide sequence of a 14-kilobase (kb) region of the human beta interferon (IFN-beta)-related DNA locus on chromosome 2 (genomic DNA clone lambda B3) was determined and compared to that of the IFN-beta 1 gene by using the Sellers TT algorithm. This algorithm aligns segments of one sequence with similar segments in a second sequence. A strategy was developed for assessing the significance of similarities between DNA sequences based on a scheme that recognizes patterns or runs of identities within an alignment. The pattern score (II) thus obtained is an entropy-like measure. Numerically it reflects the length of the longest run of identity in an alignment plus a correction factor due to the other, shorter identity runs in the alignment. When the IFN-beta 1 gene is compared to a random nucleotide sequence, the distribution of II scores in such comparisons fits a Gaussian function. This strategy has been used to identify seven segments along one strand of lambda B3 DNA that are related to segments in IFN-beta 1; these seven alignments have II scores greater than or equal to 3 standard deviations above the mean score obtained in comparisons between IFN-beta 1 and random nucleotide sequences. One of these alignments (section 7) has a II score 8.02 standard deviations above this mean score. The likelihood of finding an alignment as good as that in section 7 in a random sequence the length of the human genome is approximately 10^-7. Furthermore, the lambda B3 DNA sequence in section 7 selects the human IFN-beta 1 gene as the most significant alignment in computer searches of mammalian nucleotide sequence databases.
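
The general idea of scoring an alignment by its runs of identities and expressing that score in standard deviations above a random-sequence baseline can be illustrated with a minimal Python sketch. It rests on several simplifying assumptions that are not taken from the abstract: the alignment is a pair of equal-length strings (no Sellers TT alignment step is performed), the statistic is the longest identity run alone rather than the II pattern score (whose formula is not given here), and the function names `identity_runs`, `longest_run`, and `z_score` are hypothetical.

```python
import random
import statistics


def identity_runs(a, b):
    """Return lengths of maximal runs of identical positions in two
    equal-length aligned strings; gap characters ('-') never count."""
    runs, current = [], 0
    for x, y in zip(a, b):
        if x == y and x != "-":
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def longest_run(a, b):
    """Longest identity run; a stand-in statistic, NOT the II score."""
    return max(identity_runs(a, b), default=0)


def z_score(query, target, trials=200, seed=0):
    """Standard deviations by which the observed statistic exceeds the
    mean obtained against uniformly random nucleotide sequences."""
    rng = random.Random(seed)
    observed = longest_run(query, target)
    null = []
    for _ in range(trials):
        rand = "".join(rng.choice("ACGT") for _ in range(len(query)))
        null.append(longest_run(query, rand))
    mean = statistics.mean(null)
    sd = statistics.pstdev(null)
    return (observed - mean) / sd if sd else float("inf")
```

For example, `identity_runs("ACGTAC", "ACGAAC")` yields runs of length 3 and 2, and a sequence compared with itself scores far above the random baseline. A faithful reproduction would replace `longest_run` with the II score and generate the baseline by aligning each random sequence with the Sellers TT algorithm.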
