Beyond Traditional Classifiers: Evaluating Large Language Models for Robust Hate Speech Detection

Barakat, Basel and Jaf, Sardar. 2025. Beyond Traditional Classifiers: Evaluating Large Language Models for Robust Hate Speech Detection. Computation, 13(8), 196. ISSN 2079-3197 [Article]

computation-13-00196-v2-1.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract or Description

Hate speech detection remains a significant challenge due to the nuanced and context-dependent nature of hateful language. Traditional classifiers, trained on specialized corpora, often struggle to accurately identify subtle or manipulated hate speech. This paper explores the potential of utilizing large language models (LLMs) to address these limitations. By leveraging their extensive training on diverse texts, LLMs demonstrate a superior ability to understand context, which is crucial for effective hate speech detection. We conduct a comprehensive evaluation of various LLMs on both binary and multi-label hate speech datasets to assess their performance. Our findings clarify the extent to which LLMs can enhance hate speech classification accuracy, particularly in complex and challenging cases.
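To make the two evaluation settings mentioned in the abstract concrete, the following is a minimal sketch (not taken from the paper) of how model predictions might be scored on a binary dataset versus a multi-label one. The label names and prediction values are illustrative assumptions, not the study's actual data or metrics code.

```python
def f1(gold, pred, positive=1):
    """F1 score for one class, computed from raw true/false positive counts."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Binary setting: each post gets one label, 1 = hateful, 0 = not hateful.
gold_binary = [1, 0, 1, 1, 0]
pred_binary = [1, 0, 0, 1, 0]
print(f"binary F1: {f1(gold_binary, pred_binary):.2f}")  # -> binary F1: 0.80

# Multi-label setting: a post may carry several categories at once, so each
# label is scored independently (these category names are hypothetical).
labels = ["targets_group", "threatening", "dehumanizing"]
gold_multi = [[1, 0, 1], [0, 0, 0], [1, 1, 0]]
pred_multi = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
for i, name in enumerate(labels):
    g = [row[i] for row in gold_multi]
    p = [row[i] for row in pred_multi]
    print(f"{name} F1: {f1(g, p):.2f}")
```

Scoring each category separately is what distinguishes the multi-label case: a model can be strong on one kind of hate speech while missing another, which a single binary score would hide.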

Item Type:

Article

Identification Number (DOI):

https://doi.org/10.3390/computation13080196

Keywords:

hate speech detection; large language models (LLMs); context understanding; binary hate speech datasets; multi-label hate speech datasets; classification accuracy

Departments, Centres and Research Units:

Computing

Dates:

Date            Event
5 August 2025   Accepted
10 August 2025  Published

Item ID:

39372

Date Deposited:

13 Aug 2025 08:25

Last Modified:

13 Aug 2025 08:25

Peer Reviewed:

Yes, this version has been peer-reviewed.

URI:

https://research.gold.ac.uk/id/eprint/39372
