Analyzing the language of rejection: a study of user flagging responses to hate speech on Reddit

Sharon Lisseth Perez; Xiaoying Song; Lingzi Hong

doi:10.47989/ir30iConf47191

Authors

Sharon Lisseth Perez, University of North Texas, United States of America
Xiaoying Song, University of North Texas, United States of America
Lingzi Hong, University of North Texas, United States of America

Keywords:

flagging messages, linguistic characteristics, natural language processing

Abstract

Introduction. Online hate speech poses significant threats to individuals and society, exacerbating psychological harm, discrimination, and potential real-world violence. While automated detection models are available, their inability to recognize subtle variations of hate speech, particularly implicit forms, emphasizes the need for supplementary methods.

Method. This study investigates whether user-written flagging messages can enhance hate speech detection, focusing on the characteristics and identification of such messages. We created a dataset of flagging messages and the comments they respond to, and used transformer-based models (BERT, RoBERTa, ALBERT, DistilBERT, and XLNet) for classification.

Analysis. Linguistic analysis using SEANCE and Named Entity Recognition was conducted to reveal unique characteristics of flagging messages.

Results. Our findings show that BERT and DistilBERT models achieved the highest accuracy in classifying flagging messages, with distinct linguistic patterns emerging in flagging content.

Conclusions. This research contributes to the development of more nuanced hate speech detection methods by leveraging user-generated flagging content. These findings have implications for improving automated content moderation systems and supporting more inclusive online environments. Future work will focus on the effectiveness of flagging messages in identifying implicit hate speech across diverse cultural contexts.
