Abstract
The World Health Organisation (WHO) reports that approximately 280 million people worldwide suffer from depression. Yet existing studies on early-stage depression detection using machine learning (ML) techniques are limited. Prior studies have applied a single stand-alone algorithm, which struggles with complex data, is prone to overfitting, and generalizes poorly. To address this, we examined the performance of several ML algorithms for early-stage depression detection using two benchmark social media datasets (D1 and D2). Specifically, we incorporated sentiment indicators as an additional feature to improve model performance. Our experimental results showed that sentence bidirectional encoder representations from transformers (SBERT) numerical vectors fed into a stacking ensemble model achieved comparable F1 scores of 69% on D1 and 76% on D2. Our findings suggest that using sentiment indicators as an additional feature for depression detection improves model performance; we therefore recommend the development of a depressive term corpus in future work.
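For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a binary F1 score is computed. The function name `f1_score` and the toy labels are illustrative assumptions, not taken from the paper, and the abstract does not state which averaging scheme (binary, macro, or weighted) the authors used.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (1 = depressed, 0 = not depressed); assumed for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 balances precision against recall, it is often preferred over plain accuracy when the positive class is comparatively rare, as depressed users tend to be in social media data.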
Original language | English |
---|---|
Pages (from-to) | 112 |
Number of pages | 16 |
Journal | Big Data and Cognitive Computing |
Volume | 8 |
Issue number | 9 |
Publication status | Published - 5 Sept 2024 |
Keywords
- depression detection
- ensemble learning
- large language models
- machine learning
- mental health
- natural language processing
- sentiment analysis