SE‐Former: Incorporating sentence embeddings into Transformer for low‐resource NMT
By: Dongsheng Wang, Shaoyong Wang
Format: Article
Published: Wiley, 2023-06-01
Description
Abstract Recently, pre‐trained language models (PLMs) such as RoBERTa and SimCSE have demonstrated strengths in many natural language understanding (NLU) tasks. However, there are few works applying PLMs to neural machine translation (NMT). Motivated by alleviating the data scarcity issue of low‐resource NMT, this work explores incorporating sentence embeddings from SimCSE into the Transformer network and proposes the SE‐Former model. In this model, an embed‐fusion module is designed to utilize the output of SimCSE for NMT. Specifically, the outputs of the encoder and of SimCSE are fed into the embed‐fusion module, where an attention network learns the relationship between the sentence embedding and the corresponding word embeddings. After addition, concatenation, and linear transformation operations, a tensor fused with the sentence embedding is obtained; its size matches that of the original encoder output. Finally, the embed‐fusion module is connected to the Transformer decoder. On the IWSLT En‐Es, En‐Zh, and En‐Fr tasks, SE‐Former obtains 42.13, 29.32, and 39.21 bilingual evaluation understudy (BLEU) points, respectively. Experimental results show the superiority of this method.
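The abstract only outlines the embed‐fusion module, so the following is a minimal sketch of one way such a module could look in PyTorch. All names, dimensions, and the exact ordering of the attention, addition, concatenation, and projection steps are assumptions for illustration; the authors' actual implementation may differ.

```python
import torch
import torch.nn as nn


class EmbedFusion(nn.Module):
    """Illustrative embed-fusion module (assumed design, not the paper's code).

    Fuses a SimCSE sentence embedding with Transformer encoder outputs via
    attention, addition, concatenation, and a linear projection, so that the
    fused tensor keeps the same shape as the original encoder output.
    """

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        # Attention relating the sentence embedding to word-level encoder states.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Projection back to d_model after concatenation.
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, enc_out: torch.Tensor, sent_emb: torch.Tensor) -> torch.Tensor:
        # enc_out:  (batch, src_len, d_model)  -- Transformer encoder output
        # sent_emb: (batch, d_model)           -- SimCSE sentence embedding
        sent = sent_emb.unsqueeze(1)                      # (batch, 1, d_model)
        # Attention between word embeddings (queries) and the sentence embedding.
        attended, _ = self.attn(query=enc_out, key=sent, value=sent)
        fused = enc_out + attended                        # addition
        fused = torch.cat([fused, enc_out], dim=-1)       # concatenation
        return self.proj(fused)                           # linear transform, same size as enc_out


# Usage sketch: the fused tensor would replace the plain encoder output as the
# memory fed to the Transformer decoder's cross-attention.
fusion = EmbedFusion(d_model=512)
enc_out = torch.randn(2, 20, 512)     # dummy encoder states
sent_emb = torch.randn(2, 512)        # dummy SimCSE embeddings
memory = fusion(enc_out, sent_emb)    # (2, 20, 512)
```

Because the fused tensor has the same shape as the encoder output, it can be dropped into a standard encoder-decoder Transformer without changing the decoder, which matches the abstract's statement that the embed‐fusion module is connected directly to the Transformer decoder.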