Evaluation of the Relationship between Perceptual Speech Quality and the Accuracy Rate of Speech Recognition Systems in Noisy Environments
AUTHOR(S)
André Godoi Chiovato
PUBLICATION DATE
2005
ABSTRACT
The goal of this work is to evaluate the distortion of noisy speech signals after they have been enhanced by noise-reduction algorithms. This is done by comparing the word accuracy (%) of a standardized Automatic Speech Recognition (ASR) system with an objective measure of perceptual speech quality (the PESQ-MOS score), both obtained after applying the noise-reduction methods. The test scenario, composed of the ETSI STQ-Aurora DSR Working Group database and a standardized ASR system, was used to evaluate the following algorithms: WI008 (the ETSI STQ-Aurora standard), EMSR (the Ephraim and Malah Noise Suppressor Rule algorithm), NMT-PSS (Noise Masking Threshold Power Spectral Subtraction) and EMSR + NMT-PSS (the EMSR algorithm combined with the concept of the noise masking threshold). In addition, a curve that models the relationship between the PESQ-MOS score and the recognition rate (%) is proposed. Its purpose is to predict, under certain conditions, the performance of the recognition system from the PESQ evaluation. The approximation is based on the logistic curve, whose parameters have physical meanings that are validated by experimental results. Finally, some analyses are presented to indicate the favorable and unfavorable effects of the several noise types present in the Aurora 1 database on recognition system performance.
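As a rough illustration of the kind of model the abstract describes, the sketch below maps a PESQ-MOS score to a predicted recognition rate with a generic three-parameter logistic curve. The function name `logistic_accuracy`, the parameter names, and all numeric defaults are hypothetical and chosen only for illustration; the abstract does not give the author's exact parameterization or fitted values.

```python
import math


def logistic_accuracy(pesq_mos, max_accuracy=100.0, slope=2.0, midpoint=2.5):
    """Predict a recognition rate (%) from a PESQ-MOS score.

    Generic logistic curve; every default below is an illustrative
    assumption, not a value taken from the thesis.

    max_accuracy: upper asymptote, i.e. the rate approached on clean speech
    slope:        steepness of the transition between low and high rates
    midpoint:     PESQ-MOS score at which the rate is half of max_accuracy
    """
    return max_accuracy / (1.0 + math.exp(-slope * (pesq_mos - midpoint)))


if __name__ == "__main__":
    # Print the predicted rate for a few PESQ-MOS scores.
    for score in (1.0, 2.0, 2.5, 3.0, 4.0):
        print(f"PESQ-MOS {score:.1f} -> {logistic_accuracy(score):.1f} %")
```

In this form each parameter has a direct reading (saturation level, sensitivity, and the quality score at half performance), which is one plausible sense in which the parameters of a logistic curve can carry physical meaning.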
SUBJECT(S)
speech recognition; perceptual evaluation of speech; speech enhancement algorithm; telecommunications
ACCESS TO THE ARTICLE
http://tede.inatel.br/tde_busca/arquivo.php?codArquivo=111
RELATED DOCUMENTS
- Relationship between speech rate and speech disfluency in cluttering
- Relationship between stuttering severity in children and their mothers' speech rate
- Relationship between the quality of the clinical examination and the appropriateness of chest radiograph requests
- Evaluation of different techniques for speech recognition
- Experimental investigation using population-based algorithms in noisy environments