Stade, E. C., Ungar, L., Eichstaedt, J. C., Sherman, G., & Ruscio, A. M. (2023). Depression and anxiety have distinct and overlapping language patterns: Results from a clinical interview. Journal of Psychopathology and Clinical Science, 132(8), 972–983. https://doi.org/10.1037/abn0000850
Key Takeaways
- Focus: This study explores distinct and overlapping spoken language patterns associated with clinician‑rated depression and anxiety severity during a structured diagnostic interview.
- Aims: To identify language features shared by, and specific to, depression and anxiety by comparing clinician‑rated depression and generalized anxiety disorder (GAD) severity in the same sample, and to replicate the findings using a self‑report measure.
- Method: In a mixed clinical sample of 486 adults, researchers conducted structured ADIS‑5L diagnostic interviews, transcribed the first “Introduction” section (~897 words per participant), extracted lexicon‑based and machine‑learning language features, and analyzed their associations with clinician‑rated depression and anxiety severity.
- Findings: I‑usage, sadness, and decreased positive emotion language showed relative specificity to depression, whereas negations and broad negative emotion terms were relatively specific to anxiety; numerous features (e.g., perceptual words, causation) were shared. Machine‑learning models using language alone explained 14% of shared and up to 8% of disorder‑specific variance.
- Implications: Automated language‑based assessment tools must account for anxiety to improve discriminant validity when detecting depression.
Rationale
Depression detection via language analysis holds promise for unobtrusive, large‑scale assessment, yet many studies have relied on nonclinical samples and self‑report measures that conflate distress constructs (Kern et al., 2016; Coyne, 1994).
Although first‑person singular pronoun use is linked to depression and anxiety alike, it remains unclear whether such markers reflect unique or shared features (Tackman et al., 2019; Edwards & Holtzman, 2017).
Without a direct comparison to anxiety, language features treated as markers of depression may in fact reflect co‑occurring anxiety, leading to misclassification (Geronimi & Woodruff‑Borden, 2015).
Therefore, the next step is to conduct rigorous, clinician‑rated investigations comparing depression and anxiety within the same sample to isolate specific linguistic signatures and improve discriminant validity.
Method
A total of 486 adults (65% female; M_age = 32.9, SD = 12.8; 56% White, 26% Black) were recruited from the Philadelphia area and screened for major depressive disorder (MDD), generalized anxiety disorder (GAD), or no psychopathology.
Participants completed the ADIS-5L structured diagnostic interview; the first “Introduction” section (~897 words) was transcribed verbatim.
Linguistic features were extracted using LIWC lexica (e.g., pronouns, emotion words), the NRC emotion lexicon, and novel machine-learning–derived predictors. Severity ratings for MDD and GAD (0–8 scale) served as outcome variables.
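Lexicon-based extraction of this kind can be sketched as a word-count ratio per category. The snippet below is a minimal illustration only: the `TOY_LEXICON` word lists, the tokenizer, and the function names are assumptions made for the example, not the LIWC or NRC lexica used in the study.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; NOT the LIWC or NRC word lists.
TOY_LEXICON = {
    "i_usage": {"i", "me", "my", "mine", "myself"},
    "negation": {"not", "no", "never", "cannot"},
    "sadness": {"sad", "cry", "lonely", "hopeless"},
    "positive_emotion": {"happy", "good", "love", "glad"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def category_rates(text, lexicon=TOY_LEXICON):
    """Return each category's share of all tokens as a proportion in [0, 1]."""
    tokens = tokenize(text)
    if not tokens:
        return {category: 0.0 for category in lexicon}
    counts = Counter(tokens)
    return {
        category: sum(counts[word] for word in words) / len(tokens)
        for category, words in lexicon.items()
    }


# Made-up sentence standing in for a transcript excerpt.
sample = "I am not happy. I feel sad and I cannot sleep."
rates = category_rates(sample)
```

Normalizing by total token count keeps the rates comparable across participants whose transcripts differ in length.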
Analyses included OLS regressions controlling for age and sex, partial correlations to test specificity, PCA to derive a shared distress factor, false discovery rate correction for multiple comparisons, and elastic net regression with 10-fold cross-validation to predict shared and disorder-specific variance from language features.
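Two of these steps are simple enough to sketch directly. The code below is a simplified illustration, not the authors' pipeline: it computes a first-order partial correlation (a language feature with depression, holding anxiety constant) and applies the Benjamini-Hochberg procedure, which is one common way to control the false discovery rate. It assumes a single covariate, whereas the study also adjusted for age and sex, and all numbers are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up toy values: one language feature and two severity ratings.
feature = [1, 2, 3, 4, 5]
depression = [2, 1, 4, 3, 5]
anxiety = [1, 3, 2, 5, 4]
r_specific = partial_corr(feature, depression, anxiety)

# Made-up p-values for five hypothetical language features.
flags = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6])
```

A feature whose partial correlation with depression stays sizable after anxiety is held constant is a candidate depression-specific marker; one whose correlation shrinks toward zero is better read as a marker of shared distress.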
Results
- Shared distress: People higher in depression and people higher in anxiety alike tended to use more “I” statements and more negative-feeling words.
- Depression only: Once you account for anxiety, those higher in depression used noticeably fewer positive words.
- Anxiety only: After accounting for depression, the expected worry or fear words didn’t stand out, but people with more anxiety did use more negations (words like “not”).
- Sadness vs. negations: Sadness-related words flagged depression specifically, while frequent use of “not” (negations) flagged anxiety.
- Predictive power: A computer model looking only at how people spoke could explain about 14% of general distress, plus smaller amounts (5%–8%) of depression-only or anxiety-only severity.
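To make the "variance explained" figures concrete, the sketch below computes an out-of-sample R² with k-fold cross-validation. It rests on two assumptions: that the reported percentages correspond to a cross-validated R², and that a one-feature least-squares line is an acceptable stand-in for the elastic net model, which it replaces purely to keep the example dependency-free. The helper names and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def cross_validated_r2(x, y, k=10):
    """R^2 computed from predictions made only on held-out folds."""
    n = len(x)
    predictions = [0.0] * n
    for fold in range(k):
        # Assign every k-th observation to the current held-out fold.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        slope, intercept = fit_line(
            [x[i] for i in train_idx], [y[i] for i in train_idx]
        )
        for i in test_idx:
            predictions[i] = intercept + slope * x[i]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up data: a perfectly linear relationship, so held-out R^2 is close to 1.
x_demo = list(range(20))
y_demo = [2 * value + 1 for value in x_demo]
r2_demo = cross_validated_r2(x_demo, y_demo, k=10)
```

Because every prediction comes from a model that never saw that observation, this estimate can fall below zero when the model predicts worse than the sample mean, which is why out-of-sample values such as 5% to 14% are more conservative than in-sample fit statistics.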
Insight
- Depression’s language signature reflects heightened self‑focus and sadness expression, consistent with self‑immersed perspective theories, whereas anxiety language reflects problem articulation and broad negative affect without increased self‑reference.
- Shared markers (e.g., perceptual, causation words) underscore a general distress profile cutting across disorders.
- Novel features suggest that somatic concerns (ingestion terms) and lexical simplicity (shorter words) may be unique to depression, while more frequent negations and increased interrogatives index anxious problem‑focus.
- These findings extend prior work by demonstrating the need for transdiagnostic controls in computational linguistics studies and pave the way for refining machine‑learning models with better discriminant validity.
Clinical Implications
Automated analysis of patient speech can augment, though not replace, traditional assessments by flagging overall distress and helping to distinguish depression from anxiety; language alone explained only a modest share of severity variance.
Clinicians could integrate brief, open-ended interview prompts into intake workflows, then use language-processing tools to highlight low positive-emotion language (suggesting depression) or frequent negations (suggesting anxiety).
This approach may streamline case identification, support triage decisions, and tailor intervention focus – encouraging more positive framing for depressed clients or addressing uncertainty and avoidance in anxious clients.
Policymakers and clinic managers should ensure tool validation across diverse populations and invest in secure, privacy-preserving platforms.
Challenges include integrating new technology with existing electronic health records, ensuring clinician training on interpretation, and maintaining client consent and data security.
Socratic Questions
- How might the problem‑focused interview context have influenced language patterns, and would findings replicate with unprompted speech?
- Could shared language markers reflect overlap in symptom severity rather than true transdiagnostic constructs?
- In what ways could computational models control for comorbid conditions beyond anxiety to improve specificity?
- How might cultural or linguistic background moderate these language–psychopathology associations?
- What ethical considerations arise when deploying automated language‑based assessment in clinical and digital health platforms?
References
Stade, E. C., Ungar, L., Eichstaedt, J. C., Sherman, G., & Ruscio, A. M. (2023). Depression and anxiety have distinct and overlapping language patterns: Results from a clinical interview. Journal of Psychopathology and Clinical Science, 132(8), 972–983. https://doi.org/10.1037/abn0000850
Coyne, J. C. (1994). Self-reported distress: Analog or ersatz depression? Psychological Bulletin, 116(1), 29–45.
Edwards, T., & Holtzman, N. S. (2017). A meta-analysis of correlations between depression and first person singular pronoun use. Journal of Research in Personality, 68, 63–68.
Geronimi, E. M. C., & Woodruff-Borden, J. (2015). The language of worry: Examining linguistic elements of worry models. Cognition and Emotion, 29(2), 311–318.
Kern, M. L., Park, G., Eichstaedt, J. C., Schwartz, H. A., Sap, M., Smith, L. K., & Ungar, L. H. (2016). Gaining insights from social media language: Methodologies and challenges. Psychological Methods, 21(4), 507–525.
Tackman, A. M., Sbarra, D. A., Carey, A. L., Donnellan, M. B., Horn, A. B., Holtzman, N. S., … Mehl, M. R. (2019). Depression, negative emotionality, and self-referential language: A multi-lab, multi-measure, and multi-language-task research synthesis. Journal of Personality and Social Psychology, 116(5), 817–834.