Coding Methods for the NMF Approach to Speech Recognition and Vocabulary Acquisition
by: Meng Sun, Hugo Van Hamme
| Format: | Article |
|---|---|
| Published: | International Institute of Informatics and Cybernetics 2012-12-01 |
Description
This paper aims at improving the accuracy of the non-negative matrix factorization approach to word learning and recognition of spoken utterances. We propose and compare three coding methods to alleviate quantization errors involved in the vector quantization (VQ) of speech spectra: multi-codebooks, soft VQ and adaptive VQ. We evaluate on the task of spotting a vocabulary of 50 keywords in continuous speech. The error rates of multi-codebooks decreased with increasing number of codebooks, but the accuracy leveled off around 5 to 10 codebooks. Soft VQ and adaptive VQ made a better trade-off between the required memory and the accuracy. The best of the proposed methods reduces the error rate to 1.2% from the 1.9% obtained with a single codebook. The coding methods and the model framework may also prove useful for applications such as topic discovery/detection and mining of sequential patterns.
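To make the quantization-error idea concrete, the following is a minimal sketch contrasting hard VQ, which puts all weight on the single nearest codeword, with one possible form of soft VQ, which spreads weight over the nearest few. The function names, the toy codebook, the `top_k` and `temperature` parameters, and the exponential weighting are illustrative assumptions; the abstract does not specify the paper's exact formulation.

```python
import math

def sq_dist(x, c):
    """Squared Euclidean distance between a frame and a codeword."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def hard_vq(frame, codebook):
    """One-hot code: all weight goes to the nearest codeword."""
    dists = [sq_dist(frame, c) for c in codebook]
    best = min(range(len(codebook)), key=dists.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(codebook))]

def soft_vq(frame, codebook, top_k=2, temperature=1.0):
    """Soft code: weights over the top_k nearest codewords, summing to 1.

    The exponential weighting is an assumed choice for illustration,
    not necessarily the one used by the authors.
    """
    dists = [sq_dist(frame, c) for c in codebook]
    nearest = sorted(range(len(codebook)), key=dists.__getitem__)[:top_k]
    d_min = dists[nearest[0]]  # subtract the minimum for numerical stability
    scores = {i: math.exp(-(dists[i] - d_min) / temperature) for i in nearest}
    total = sum(scores.values())
    return [scores.get(i, 0.0) / total for i in range(len(codebook))]

# Hypothetical 2-D codebook and a frame lying between the first two codewords.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
frame = [0.45, 0.1]

hard_code = hard_vq(frame, codebook)
soft_code = soft_vq(frame, codebook)
print(hard_code)
print(soft_code)
```

In this toy case the frame sits almost halfway between two codewords. The hard code discards that information, while the soft code keeps it as two comparable weights, which is the kind of quantization error the coding methods aim to reduce.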