Hate Speech Detection in Indonesian Twitter using Contextual Embedding Approach

by: Guntur Budi Herwanto, Annisa Maulida Ningtyas, I Gede Mujiyatna, Kurniawan Eka Nugraha, I Nyoman Prayana Trisna

Format: Article
Published: Universitas Gadjah Mada 2021-04-01

Description

Hate speech has spread alongside the rapid growth of social media. It is often posted because the public is not aware of the difference between criticism and statements that may constitute a criminal offense. It is therefore important to detect such sentences early, before they are posted and lead to a criminal act committed out of ignorance. In this paper, we use deep neural networks to predict whether a sentence contains hate speech and an abusive tone. We demonstrate the robustness of different word and contextual embeddings for representing the semantics of hate speech words. In addition, we use a document embedding built with a recurrent neural network with gated recurrent units as the main architecture to provide a richer representation. Compared with the syntactic representation used in the previous approach, the contextual embedding in our model improves performance by a significant margin.