Rethinking psychometrics through LLMs: how item semantics shape measurement and prediction in psychological questionnaires
Preti, Antonio
2025-01-01
Abstract
Psychological questionnaires are typically designed to measure latent constructs by asking respondents a series of semantically related questions. But what if these semantic relationships, rather than reflecting only the underlying construct, also impose their own structure on the data we collect? In other words, to what extent is what we “measure” in questionnaires shaped a priori by item semantics rather than revealed solely a posteriori through empirical correlations? To examine this epistemological question, we propose LLMs Psychometrics, a novel paradigm that harnesses large language models (LLMs) to investigate how the semantic structure of questionnaire items influences psychometric outcomes. We hypothesize that the correlations among items partly mirror their linguistic similarity, such that LLMs can predict these correlations even in the absence of empirical data. To test this, we compared actual correlation matrices from established instruments, the Big Five personality inventory (Big 5) and the Depression Anxiety Stress Scale (DASS-42), with the semantic similarity structures computed by LLMs. Among the top three semantically most similar items, the empirically most correlated item was found in 95% of DASS cases and 82% of Big 5 cases. Building on this, we developed PsychoLLM, a proof-of-concept neural architecture that uses item semantics to predict responses to new items, demonstrated with the Generalized Anxiety Disorder-7 (GAD-7) and the Patient Health Questionnaire-9 (PHQ-9). PsychoLLM achieved 70% accuracy when predicting one scale’s responses from the other, enabling new analyses based on semantic relationships. This work underscores an important epistemological implication for psychometrics: item semantics may influence measurement outcomes to varying degrees, more extensively than previously assumed.
By leveraging LLMs to expose this a priori semantic structure, researchers can refine questionnaire design, assess data quality, and expand interpretive possibilities, ultimately inviting a reexamination of “what” and “how” we truly measure in psychology.
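The abstract's central check, whether each item's empirically most correlated partner appears among its top three semantically most similar items, can be sketched as follows. This is a minimal illustration with made-up toy data, not the authors' implementation: the embedding vectors stand in for LLM item embeddings, and the "empirical" correlation matrix is simulated rather than drawn from real DASS-42 or Big 5 responses.

```python
import numpy as np

# Toy setup: 6 hypothetical questionnaire items. The embeddings stand in
# for LLM sentence embeddings of the item texts; the empirical matrix
# stands in for item correlations computed from respondent data.
rng = np.random.default_rng(0)
n_items = 6
embeddings = rng.normal(size=(n_items, 8))                # fake item embeddings
empirical = np.corrcoef(rng.normal(size=(n_items, 50)))   # fake empirical correlations

# Cosine similarity between item embeddings: the a priori "semantic structure".
unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
semantic = unit @ unit.T

# For each item, ask: is its most empirically correlated partner among the
# three items the semantic structure ranks as most similar to it?
hits = 0
for i in range(n_items):
    emp = empirical[i].copy()
    sem = semantic[i].copy()
    emp[i] = sem[i] = -np.inf          # exclude the item itself
    best_empirical = int(np.argmax(emp))
    top3_semantic = set(np.argsort(sem)[-3:].tolist())
    hits += best_empirical in top3_semantic

hit_rate = hits / n_items
print(f"top-3 semantic hit rate: {hit_rate:.2f}")
```

With real data, the reported hit rates (95% for DASS-42, 82% for Big 5) would correspond to `hit_rate` computed over all items of each instrument.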



