EmoMBTI-Net: introducing and leveraging a novel emoji dataset for personality profiling with large language models
Kumar, Akshi and Jain, Dipika. 2024. EmoMBTI-Net: introducing and leveraging a novel emoji dataset for personality profiling with large language models. Social Network Analysis and Mining, 14, 234. ISSN 1869-5450 [Article]
Text: s13278-024-01400-z.pdf (1MB), Published Version. Available under License Creative Commons Attribution.
Abstract or Description
Emojis, integral to digital communication, often encapsulate complex emotional layers that enhance text beyond mere words. This research leverages the expressive power of emojis to predict Myers-Briggs Type Indicator (MBTI) personalities, diverging from conventional text-based approaches. We developed a unique dataset, EmoMBTI, by mapping emojis to specific MBTI traits using diverse posts scraped from Reddit. This dataset enabled the integration of Natural Language Processing (NLP) techniques tailored for emoji analysis. Large Language Models (LLMs) such as FlanT5, BART, and PEGASUS were trained to generate contextual linkages between text and emojis, further correlating these emojis with MBTI personalities. Following the creation of this dataset, these LLMs were applied to understand the context conveyed by emojis and were subsequently fine-tuned. Additionally, transformer models like RoBERTa, DeBERTa, and BART were specifically fine-tuned to predict MBTI personalities based on emoji mappings from MBTI dataset posts. Our methodology significantly enhances the capability of personality assessments, with the fine-tuned BART model achieving an impressive accuracy of 0.875 in predicting MBTI types, which notably exceeds the performances of RoBERTa and DeBERTa, at 0.82 and 0.84 respectively. By leveraging the nuanced communication potential of emojis, this approach not only advances personality profiling techniques but also deepens insights into digital behaviour, highlighting the substantial impact of emotive icons in online interactions.
Item Type: Article
Data Access Statement: No datasets were generated or analysed during the current study.
Keywords: Sentiment analysis, Personality, MBTI, Emojis, LLM, Natural language understanding
Item ID: 37968
Date Deposited: 11 Dec 2024 09:15
Last Modified: 11 Dec 2024 09:15
Peer Reviewed: Yes, this version has been peer-reviewed.