Study on Keyword Search Framework Based on End-to-End Automatic Speech Recognition

by: YANG Run-yan, CHENG Gao-feng, LIU Jian

Format:	Article
Published:	Editorial office of Computer Science 2022-01-01

Description

Over the past decade, end-to-end automatic speech recognition (ASR) frameworks have developed rapidly. End-to-end ASR differs markedly from traditional ASR based on hidden Markov models (HMMs) and also delivers strong performance. As a result, end-to-end ASR is becoming increasingly popular and has become another major type of ASR framework. A keyword search (KWS) framework based on end-to-end ASR and frame-synchronous alignment is proposed to address the problem that end-to-end ASR cannot provide accurate keyword timestamps and confidence scores, and it is verified experimentally on a Vietnamese dataset. First, utterances are decoded by an end-to-end ASR system to obtain N-best hypotheses. Next, a dynamic programming-based alignment algorithm is applied to each ASR hypothesis and the per-frame phoneme probabilities, which are provided by a phoneme classifier jointly trained with the ASR model, to compute timestamps and confidence scores for each word in the N-best hypotheses. Then, the final KWS result is obtained by detecting keywords within the N-best hypotheses and removing duplicate keyword occurrences according to their timestamps and confidence scores. Experimental results on a Vietnamese conversational telephone speech dataset show that the proposed KWS system achieves an F1 score of 77.6%, a relative improvement of 7.8% over the F1 score of the traditional HMM-based KWS system. The proposed system also provides reliable keyword confidence scores.
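
The alignment step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a plain Viterbi-style monotonic alignment in which every phoneme of a hypothesis covers at least one frame, and it assumes the confidence of a segment is the geometric mean of its per-frame probabilities. The function name force_align and the input layout (one dictionary of log-probabilities per frame) are hypothetical choices made for illustration.

```python
import math


def force_align(log_probs, phonemes):
    """Align a phoneme sequence to frames with dynamic programming.

    log_probs: list of T dicts, each mapping phoneme -> log-probability
               (assumed finite) for that frame.
    phonemes:  list of N phonemes from one ASR hypothesis (1 <= N <= T).
    Returns a list of (phoneme, start_frame, end_frame, confidence),
    with end_frame inclusive.
    """
    T, N = len(log_probs), len(phonemes)
    if N == 0 or N > T:
        raise ValueError("need 1 <= len(phonemes) <= number of frames")

    neg_inf = float("-inf")
    # score[t][j]: best log-score of aligning frames 0..t to phonemes 0..j
    score = [[neg_inf] * N for _ in range(T)]
    # back[t][j]: 1 if frame t is the first frame of phoneme j, else 0
    back = [[0] * N for _ in range(T)]
    score[0][0] = log_probs[0][phonemes[0]]

    for t in range(1, T):
        for j in range(min(t + 1, N)):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else neg_inf
            if move > stay:
                best, back[t][j] = move, 1
            else:
                best, back[t][j] = stay, 0
            score[t][j] = best + log_probs[t][phonemes[j]]

    if score[T - 1][N - 1] == neg_inf:
        raise ValueError("no valid alignment found")

    # Backtrace from the last frame and last phoneme to recover boundaries.
    bounds = []
    j, end = N - 1, T - 1
    for t in range(T - 1, 0, -1):
        if back[t][j] == 1:
            bounds.append((j, t, end))
            end = t - 1
            j -= 1
    bounds.append((j, 0, end))
    bounds.reverse()

    segments = []
    for j, start, end in bounds:
        frame_lps = [log_probs[t][phonemes[j]] for t in range(start, end + 1)]
        # Assumed confidence measure: geometric mean of frame probabilities.
        confidence = math.exp(sum(frame_lps) / len(frame_lps))
        segments.append((phonemes[j], start, end, confidence))
    return segments
```

Under these assumptions, word-level timestamps would follow by merging the segments of the phonemes that make up each word, and frame indices would be converted to seconds using the frame shift of the acoustic front end.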

