Search and retrieval of software components using component clustering

AUTHOR(S)
PUBLICATION DATE

2008

ABSTRACT

The development of software with reuse is an approach that can be used to achieve two main benefits: (1) increased productivity in software projects and (2) improved quality of the final products. Software reuse can be realized through component-based development. Under this strategy, large software applications are built from pre-existing, reusable parts that collaborate to provide the functionality the application requires. The places where these components are stored (repositories) and the processes for searching and retrieving them remain subjects of ongoing research and discussion. In another context, solutions based on machine learning and artificial intelligence are beginning to make relevant contributions to problems in the software development cycle, in fields such as software project effort estimation and automatic failure prediction. This work investigates the application of clustering techniques (a subset of machine learning techniques) to the problem of software reuse. We considered the following clustering techniques: (1) self-organizing maps (SOM), (2) growing hierarchical self-organizing maps (GHSOM), and (3) suffix tree clustering (STC). It is important to stress that this is the first work to apply STC to the problem of search and retrieval in software component repositories. We implemented a prototype of a Web tool for search and retrieval of software components based on STC, named Cluco (Clustering of Components). Cluco presents the components that meet the criteria of a query as clusters of similar components, where the clusters are generated by the STC algorithm.
This feature can be considered an important contribution because the manual effort of searching for similarities, which would otherwise fall to users, is performed automatically by the system as soon as the search results become available. We describe a number of qualitative and quantitative evaluations of the proposed STC-based method. Users with various levels of experience in software engineering evaluated the tool by carrying out searches and answering a questionnaire on usability and the quality of the search results. Evaluation metrics for information retrieval systems, such as recall and precision, were used to validate the proposed method quantitatively. A performance analysis comparing the techniques investigated in this work was also carried out. The results show the superiority of STC in clustering the software components used in this work (Java components). Considering all the results obtained, we conclude that the proposed solution contributes in a positive and relevant way to the problem of search and retrieval of software components.
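
As an illustration of the recall and precision metrics mentioned above, the following minimal sketch computes both for a single query. It uses the standard information retrieval definitions and is not taken from the Cluco implementation; the function name and the component names in the usage example are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: components returned by the search (hypothetical input)
    relevant:  components judged relevant to the query (hypothetical input)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant components that were actually returned

    # Precision: fraction of the returned components that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: fraction of the relevant components that were returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Usage with made-up component names (illustrative only).
retrieved = ["ArrayList", "LinkedList", "HashMap", "TreeMap"]
relevant = ["ArrayList", "LinkedList", "Vector"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up query, two of the four returned components are relevant, and two of the three relevant components are returned. Both functions return 0.0 for an empty input set, which is a convention chosen for this sketch rather than part of the metric definitions.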

SUBJECT(S)

software engineering; software reuse; artificial intelligence; machine learning; clustering; computer science
