SE‐Former: Incorporating sentence embeddings into Transformer for low‐resource NMT
By: Dongsheng Wang, Shaoyong Wang
Format: Article
Published: Wiley, 2023-06-01
Description
Abstract Recently, pre‐trained language models (PLMs) such as RoBERTa and SimCSE have demonstrated strengths in many natural language understanding (NLU) tasks. However, there are few works applying PLMs to neural machine translation (NMT). Motivated by alleviating the data scarcity issue of low‐resource NMT, this work explores incorporating sentence embeddings from SimCSE into the Transformer network and proposes the SE‐Former model. In this model, an embed‐fusion module is designed to utilize the output of SimCSE for NMT. Specifically, the outputs of the encoder and of SimCSE are fed into the embed‐fusion module, where an attention network learns the relationship between the sentence embedding and the corresponding word embeddings. After addition, concatenation, and linear transformation operations, a tensor fused with the sentence embedding is obtained; its size matches that of the original encoder output. Finally, the embed‐fusion module is connected to the Transformer decoder. On the IWSLT En‐Es, En‐Zh, and En‐Fr tasks, SE‐Former obtains 42.13, 29.32, and 39.21 bilingual evaluation understudy (BLEU) points, respectively. Experimental results show the superiority of this method.
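The abstract only outlines the embed‐fusion module, so the following is a minimal sketch of one way such a module could look in PyTorch. All names, dimensions, and the exact ordering of the attention, addition, concatenation, and projection steps are assumptions for illustration; the authors' actual implementation may differ.

```python
import torch
import torch.nn as nn


class EmbedFusion(nn.Module):
    """Illustrative embed-fusion module (assumed design, not the paper's code).

    Fuses a SimCSE sentence embedding with Transformer encoder outputs via
    attention, addition, concatenation, and a linear projection, so that the
    fused tensor keeps the same shape as the original encoder output.
    """

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        # Attention relating the sentence embedding to word-level encoder states.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Projection back to d_model after concatenation.
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, enc_out: torch.Tensor, sent_emb: torch.Tensor) -> torch.Tensor:
        # enc_out:  (batch, src_len, d_model)  -- Transformer encoder output
        # sent_emb: (batch, d_model)           -- SimCSE sentence embedding
        sent = sent_emb.unsqueeze(1)                      # (batch, 1, d_model)
        # Attention between word embeddings (queries) and the sentence embedding.
        attended, _ = self.attn(query=enc_out, key=sent, value=sent)
        fused = enc_out + attended                        # addition
        fused = torch.cat([fused, enc_out], dim=-1)       # concatenation
        return self.proj(fused)                           # linear transform, same size as enc_out


# Usage sketch: the fused tensor would replace the plain encoder output as the
# memory fed to the Transformer decoder's cross-attention.
fusion = EmbedFusion(d_model=512)
enc_out = torch.randn(2, 20, 512)     # dummy encoder states
sent_emb = torch.randn(2, 512)        # dummy SimCSE embeddings
memory = fusion(enc_out, sent_emb)    # (2, 20, 512)
```

Because the fused tensor has the same shape as the encoder output, it can be dropped into a standard encoder-decoder Transformer without changing the decoder, which matches the abstract's statement that the embed‐fusion module is connected directly to the Transformer decoder.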