Linguistically motivated filter induction in information retrieval / Indução de filtros lingüisticamente motivados na recuperação de informação




Although Information Retrieval and Filtering tasks have always used basic Natural Language Processing (NLP) techniques to support document structuring, there is still room for more sophisticated NLP techniques that justify their cost when compared with traditional approaches. This research investigates evidence for the hypothesis that linguistically based methods are feasible and can make relevant contributions to this area. In this work, the noun phrases of a text are used as descriptors whose evidence is calculated by statistical methods. Filters are then induced to classify the retrieved documents by measuring their implicit relevance as presupposed by a user profile. The increase in precision (efficacy) of IR systems that results from using NLP techniques for text classification in the filtering task is evidence that this approach can be explored further.


natural language processing; noun phrases; machine learning; text categorization; information retrieval; information filtering

