SENS-HEAD: A Machine Learning Framework for Sensationalism Detection in News Headlines Using Linguistic and Semantic Features

Chang, Po-Hsuan; Kumar, Akshi and Sangwan, Saurabh Raj. 2025. SENS-HEAD: A Machine Learning Framework for Sensationalism Detection in News Headlines Using Linguistic and Semantic Features. British Journal of Multidisciplinary and Advanced Studies, 6(3), ISSN 2517-276X [Article]

Text
SENS-HEAD.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract or Description

The proliferation of sensationalized news headlines has raised concerns about media integrity, necessitating automated approaches for detecting sensationalism beyond traditional clickbait classification. This study presents SENS-HEAD, a novel dataset comprising over 30,000 annotated headlines labelled for sensational content and emotional arousal. Employing Natural Language Processing (NLP), we extract a diverse set of linguistic and semantic features, including sentiment polarity, syntactic complexity, punctuation distribution, and stop word ratio, to systematically distinguish sensational from non-sensational headlines. We implement ensemble learning models (XGBoost, CatBoost, and Random Forest), achieving a balanced F1-score of 0.66. To enhance interpretability, we integrate SHAP (SHapley Additive exPlanations), unveiling key predictive markers such as stop word frequency, headline length, and sentiment extremity. The findings not only advance explainable AI (XAI) for sensationalism detection but also provide practical applications in automated journalism, content moderation, and media ethics regulation. By strengthening computational linguistics with ethical AI, this research delivers actionable insights for policymakers and promotes trustworthy news dissemination in the digital era.
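
To illustrate the kind of surface features the abstract mentions (stop word ratio, punctuation distribution, headline length), here is a minimal sketch in Python. It is not the authors' implementation: the function name `headline_features`, the feature names, and the tiny `STOP_WORDS` list are assumptions made for illustration, and the record does not state which stop word list or tokenizer the study used.

```python
import string

# Hypothetical, deliberately small stop word list for illustration only;
# the study's actual list is not given in this record.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "to", "of", "in", "on",
    "and", "you", "this", "that", "will", "not",
}


def headline_features(headline: str) -> dict:
    """Compute a few simple surface features for one headline."""
    # Whitespace tokenization, stripping leading/trailing punctuation.
    tokens = [t.strip(string.punctuation).lower() for t in headline.split()]
    tokens = [t for t in tokens if t]  # drop tokens that were pure punctuation
    n_tokens = len(tokens)
    stop_count = sum(t in STOP_WORDS for t in tokens)
    return {
        "length_tokens": n_tokens,
        # Guard against division by zero for empty headlines.
        "stop_word_ratio": stop_count / n_tokens if n_tokens else 0.0,
        "exclamation_count": headline.count("!"),
        "question_count": headline.count("?"),
        "punctuation_count": sum(ch in string.punctuation for ch in headline),
    }


features = headline_features("You will not believe this!")
print(features)
```

Feature dictionaries of this shape could then be assembled into a table and passed to a tree-based classifier; sentiment polarity and syntactic complexity would need additional NLP tooling that this sketch leaves out.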

Item Type:

Article

Identification Number (DOI):

https://doi.org/10.37745/bjmas.2022.04909

Data Access Statement:

A dataset will be made available on request

Keywords:

Sensationalism detection, linguistic features, NLP, machine learning, news headlines, XAI

Departments, Centres and Research Units:

Computing

Dates:

Accepted: 23 May 2025
Published: 1 June 2025

Item ID:

38935

Date Deposited:

03 Jun 2025 08:44

Last Modified:

03 Jun 2025 08:52

Peer Reviewed:

Yes, this version has been peer-reviewed.

URI:

https://research.gold.ac.uk/id/eprint/38935
