Development of a methodology for predicting translation initiation sites (Desenvolvimento de uma metodologia para previsão de sítios de início de tradução)
AUTHOR(S)
Cristiane Neri Nobre
PUBLICATION DATE
2007
ABSTRACT
Correctly predicting the translation initiation site (TIS) in mRNA sequences is an important task in genome annotation, but it is not trivial. Translation frequently starts at the first AUG, but this is not a rule. The problem can therefore be modeled as a classification problem between positive patterns (coding sequences) and negative patterns (non-coding sequences). To approach it, this work proposes the following methodology: (1) an alternative extraction of negative patterns; (2) use of a shorter sequence window; (3) a modified encoding of the nucleotides; (4) use of SMOTE, a class-balancing method, since the problem is highly imbalanced (about 1:29 on average) for the databases used in this work; (5) use of a transductive approach in addition to traditional inductive inference; and (6) use of the Support Vector Machine (SVM) classifier with simple kernel functions. The methodology was tested on sequences collected by Petersen and Nielsen and on RefSeq (Reference Sequences) sequences from NCBI (National Center for Biotechnology Information) for five organisms: Danio rerio, Drosophila melanogaster, Homo sapiens, Mus musculus and Rattus norvegicus, under six distinct inspection levels (reviewed, provisional, predicted, validated, model and inferred). As a result, accuracy, adjusted accuracy, precision, sensitivity and specificity above 95% were attained on average, using out-of-frame negative patterns during the training step, 24-nucleotide windows, encoding by triplets, pattern balancing with SMOTE, an SVM classifier, and a scanning model in which validation is tested up to the TIS.
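The pattern extraction and triplet encoding described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation: the function names, the 12/12 split of the 24-nucleotide window around the candidate ATG, the reading of "out-of-frame negatives" as ATGs whose offset from the annotated TIS is not a multiple of 3, and the one-hot encoding over 64 non-overlapping triplets are all hypothetical choices made for the example.

```python
from itertools import product

# Assumption: the 64 triplets are indexed in lexicographic order over "ACGT".
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]
TRIPLET_INDEX = {t: i for i, t in enumerate(TRIPLETS)}


def candidate_windows(seq, tis_pos, upstream=12, downstream=12):
    """Yield (position, window, label) for ATG candidates in a cDNA sequence.

    label is +1 for the annotated TIS and -1 for an out-of-frame ATG.
    In-frame ATGs other than the TIS are skipped (assumed reading of
    "negative patterns out of frame"). Candidates too close to either
    end of the sequence to fill the window are skipped as well.
    """
    seq = seq.upper()
    pos = seq.find("ATG")
    while pos != -1:
        start, end = pos - upstream, pos + downstream
        if start >= 0 and end <= len(seq):
            if pos == tis_pos:
                yield pos, seq[start:end], 1
            elif (pos - tis_pos) % 3 != 0:
                yield pos, seq[start:end], -1
        pos = seq.find("ATG", pos + 1)


def encode_triplets(window):
    """One-hot encode a window as consecutive non-overlapping triplets.

    Each triplet contributes 64 features; a triplet containing a symbol
    outside ACGT is encoded as all zeros.
    """
    vec = []
    for i in range(0, len(window) - 2, 3):
        one_hot = [0] * 64
        idx = TRIPLET_INDEX.get(window[i:i + 3])
        if idx is not None:
            one_hot[idx] = 1
        vec.extend(one_hot)
    return vec


if __name__ == "__main__":
    # Toy sequence: annotated TIS at index 13, out-of-frame ATG at index 17.
    toy = "C" * 13 + "ATG" + "C" + "ATG" + "C" * 12
    for pos, window, label in candidate_windows(toy, tis_pos=13):
        print(pos, label, len(encode_triplets(window)))
```

With a 24-nucleotide window this yields 8 triplets, so each pattern becomes a 512-dimensional binary vector. Such vectors could then be passed to a class-balancing step and an SVM (for instance the SMOTE implementation in imbalanced-learn and `SVC` in scikit-learn, neither of which is named in the abstract).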
ARTICLE ACCESS
http://hdl.handle.net/1843/GRFO-7P4LQ9