Detecting Suicidal Ideation in Adolescence Using Self-Reported Emotional and Behavioral Patterns: Comparing Machine Learning and Large Language Model Predictions
Marengo, Davide; Longobardi, Claudio
2025-01-01
Abstract
Suicidal ideation in adolescents is a critical public health issue requiring early detection. This study examined whether machine learning (ML) and large language models (LLMs) can detect ideation in 1,197 students (ages 10–15) using self-reported Strengths and Difficulties Questionnaire (SDQ) data. Clinically relevant ideation was defined using Suicidal Ideation Questionnaire-Junior (SIQ-JR) cut-offs. Gemini 1.5 Pro and GPT-4o were prompted to estimate SIQ-JR scores from SDQ responses and demographics; Logistic Regression, Naive Bayes, and Random Forest models were trained on either SDQ data or LLM predictions. LLM predictions correlated with SIQ-JR scores (ρ = .61) and showed good discrimination across thresholds (area under the curve [AUC] ≥ .83). Item-level associations paralleled those of the self-reports, revealing strong associations with emotional symptoms and peer problems. In cross-validated analyses, the best SDQ-based ML model reached sensitivity = .85 and specificity = .72; the best LLM-based model achieved sensitivity = .80 and specificity = .74. Notably, ML models trained directly on SDQ responses consistently outperformed those incorporating LLM predictions across all SIQ-JR thresholds. Nonetheless, LLMs demonstrated promising accuracy in identifying suicidal ideation based on SDQ and demographic data. Further refinement and validation are required before these approaches can be considered viable for clinical implementation.

| File | Access | File type | Size | Format |
|---|---|---|---|---|
| accepted_manuscript.pdf | Open access | Postprint (author's final version) | 777.74 kB | Adobe PDF |
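
For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed from a binary ideation label and a continuous predicted score. It is not the authors' analysis code: the function names, the toy labels and scores, and the threshold of 15 are all hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(y_true, y_score, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are flagged."""
    tp = fn = tn = fp = 0
    for truth, score in zip(y_true, y_score):
        flagged = score >= threshold
        if truth and flagged:
            tp += 1          # case correctly flagged
        elif truth:
            fn += 1          # case missed
        elif flagged:
            fp += 1          # non-case wrongly flagged
        else:
            tn += 1          # non-case correctly passed
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def auc(y_true, y_score):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t]
    neg = [s for t, s in zip(y_true, y_score) if not t]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = above the ideation cut-off.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [30, 12, 25, 5, 18, 8, 3, 22]

sens, spec = sensitivity_specificity(y_true, y_score, threshold=15)
area = auc(y_true, y_score)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={area:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports AUC per SIQ-JR cut-off alongside the sensitivity and specificity of the best classifiers.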



