Application of statistical and biological validation techniques to the clustering of gene expression data

AUTHOR(S)
PUBLICATION DATE

2006

ABSTRACT

The exponential growth of gene expression data produced by DNA microarray technology has increased the demand for efficient computational tools that support the analysis and interpretation of these data. Clustering techniques are useful for analyzing gene expression data because they can identify patterns among gene expression profiles that may carry meaningful biological information, such as gene function and the biological processes involved. In this work we applied the one-dimensional clustering algorithms k-means and SOM and the two-dimensional algorithm SAMBA, with different parameters, to two different gene expression databases. Analyzing clustering results is sensitive and even subjective because it involves a large set of heterogeneous factors. For this reason, the best clustering solution was chosen using different statistical and biological validation techniques. Combining different clustering algorithms, parameters, databases, and statistical and biological validation techniques confirmed some advantages and disadvantages of the clustering algorithms and also showed which statistical indexes were corroborated by the biological validation process. Furthermore, highly homogeneous gene expression clusters were formed. By determining biological function and transcription factor binding sites, we were able to show that the genes in these clusters are related through a shared function or biological process.
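As a rough illustration of the workflow described above (cluster expression profiles, then score the partition with a statistical index), the following minimal sketch runs a plain k-means on a toy matrix and computes the mean silhouette width. The toy profiles, the function names, the fixed seed, and the choice of silhouette as the index are all assumptions made for illustration; the abstract does not state which statistical indexes or settings the work actually used.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(profiles, k, iterations=50, seed=0):
    """Plain k-means; returns (labels, centroids). Illustrative only."""
    rng = random.Random(seed)
    centroids = rng.sample(profiles, k)  # start from k distinct profiles
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


def silhouette(profiles, labels):
    """Mean silhouette width of a partition (higher means tighter clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own or len(clusters) < 2:
            scores.append(0.0)  # singleton cluster or single-cluster partition
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(profiles[i], profiles[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(profiles[i], profiles[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: rows are genes, columns are conditions.
profiles = [
    [0.1, 1.0, 2.1], [0.0, 1.1, 1.9], [0.2, 0.9, 2.0],  # rising expression
    [2.0, 1.0, 0.1], [2.1, 0.9, 0.0], [1.9, 1.1, 0.2],  # falling expression
]

labels, _ = kmeans(profiles, k=2)
score = silhouette(profiles, labels)
print(labels, round(score, 3))
```

In practice the same scoring would be repeated across algorithms and parameter settings, and the statistically preferred partitions would then be checked against biological evidence such as functional annotation.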

SUBJECT(S)

molecular biology; computer science; genetic programming (computing); genomics; computational biology
