Curious Cat Corpus

This page hosts the Curious Cat abusive language corpus, which consists of 2,482 question-answer pairs collected from Curious Cat.

Contributors

Niloofar Safi Samghabadi
Afsheen Hatami
Mahsa Shafaei
Sudipta Kar
Thamar Solorio

Abstract

In recent years, abusive behavior has become a serious issue in online social networks. In this paper, we present a new corpus for the task of abusive language detection that is collected from a semi-anonymous online platform, and unlike the majority of other available resources, is not created based on a specific list of bad words. We also develop computational models to incorporate emotions into textual cues to improve aggression identification. We evaluate our proposed methods on a set of corpora related to the task and show promising results with respect to abusive language detection.

Here is the link to download the data: link