Towards Non-Toxic Landscapes: Automatic Toxic Comment Detection Using DNN

Ashwin Geet d'Sa; Irina Illina; Dominique Fohr

Conference paper, Year: 2020

Ashwin Geet d'Sa

Role: Author

Speech Modeling for Facilitating Oral-Based Communication

Irina Illina

Role: Author
PersonId : 15663
IdHAL : irina-illina
IdRef : 120731746

Speech Modeling for Facilitating Oral-Based Communication

Dominique Fohr

Role: Author
PersonId : 15652
IdHAL : dominique-fohr
IdRef : 031092942

Speech Modeling for Facilitating Oral-Based Communication

Abstract

The spectacular expansion of the Internet has given rise to a new research problem in natural language processing: automatic toxic comment detection. The problem matters in practice because many countries prohibit hate speech in public media. There is no clear, formal definition of hate, offensive, toxic and abusive speech. In this article, we group all of these terms under the umbrella of "toxic speech". The contribution of this paper is the design of binary classification and regression-based approaches that predict whether a comment is toxic or not. We compare different unsupervised word representations and different DNN-based classifiers. Moreover, we study the robustness of the proposed approaches to adversarial attacks that add one (healthy or toxic) word to a comment. We evaluate the proposed methodology on the English Wikipedia Detox corpus. Our experiments show that BERT fine-tuning outperforms feature-based BERT, Mikolov's and fastText representations combined with different DNN classifiers.
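The one-word adversarial attack mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon scorer `toy_toxicity_score` is a hypothetical stand-in for a trained DNN, the word is appended at the end of the comment (the abstract does not say where the word is added), and the 0.5 decision threshold is arbitrary. It is not the authors' implementation.

```python
from typing import Callable

# Hypothetical toy lexicons: a stand-in for a trained model, not the paper's classifier.
TOXIC_WORDS = {"idiot", "stupid"}
HEALTHY_WORDS = {"love", "thanks"}


def toy_toxicity_score(comment: str) -> float:
    """Return a score in [0, 1]: the share of toxic words among lexicon hits."""
    tokens = comment.lower().split()
    toxic = sum(t in TOXIC_WORDS for t in tokens)
    healthy = sum(t in HEALTHY_WORDS for t in tokens)
    if toxic + healthy == 0:
        return 0.0
    return toxic / (toxic + healthy)


def add_one_word(comment: str, word: str) -> str:
    """Build the perturbed comment by appending a single word (assumed position)."""
    return f"{comment} {word}"


def is_toxic(score_fn: Callable[[str], float], comment: str,
             threshold: float = 0.5) -> bool:
    """Binary decision: toxic when the score is strictly above the threshold."""
    return score_fn(comment) > threshold


def attack_flips_label(score_fn: Callable[[str], float], comment: str,
                       word: str, threshold: float = 0.5) -> bool:
    """Return True if adding one word changes the binary decision."""
    before = is_toxic(score_fn, comment, threshold)
    after = is_toxic(score_fn, add_one_word(comment, word), threshold)
    return before != after


if __name__ == "__main__":
    # A healthy word appended to a toxic comment, and the reverse case.
    print(attack_flips_label(toy_toxicity_score, "you are an idiot", "love"))
    print(attack_flips_label(toy_toxicity_score, "thanks for the edit", "idiot"))
```

Passing the scorer as a callable means the same robustness check could, in principle, be pointed at any model that maps a comment to a toxicity score.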

Keywords

hate speech detection; word embeddings; deep neural networks

Domains

Artificial intelligence [cs.AI]

Main file

trac1_LREC_v8.pdf (162.03 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-02530879

Submitted on: Friday, April 3, 2020, 11:41:30

Last modified on: Monday, September 11, 2023, 17:41:19

Dates and versions

hal-02530879 , version 1 (03-04-2020)

Identifiers

HAL Id : hal-02530879 , version 1

Cite

Ashwin Geet d'Sa, Irina Illina, Dominique Fohr. Towards Non-Toxic Landscapes: Automatic Toxic Comment Detection Using DNN. TRAC-2020, Second Workshop on Trolling, Aggression and Cyberbullying (LREC, 2020), May 2020, Marseille, France. ⟨hal-02530879⟩

Collections

CNRS INRIA GRID5000 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD IMPACT-OLKI SILECS

1093 views

7058 downloads
