LANGUAGE OF STRESS: A CORPUS-BASED STUDY TO DETECT EARLY SIGNS OF SUICIDE THROUGH LEXICAL CHOICE

Authors

  • Afshan Ishfaq, Assistant Professor, Head of Academics at Institute of Law, Lahore
  • Nida Sultan, Lecturer in English at Namal University, Mianwali

DOI:

https://doi.org/10.63878/jalt2310

Keywords:

corpus linguistics, suicide prevention, lexical analysis, language of stress, computational stylistics, mental health discourse, psycholinguistics, natural language processing, keyness, LIWC.

Abstract

Suicide represents one of the most devastating and preventable causes of premature death globally, yet its early detection remains stubbornly elusive. This study advances the hypothesis that language, specifically the spontaneous lexical choices individuals make in everyday written and digital discourse, constitutes one of the most sensitive and accessible markers of suicidal ideation. We employ corpus-based methodologies to conduct a systematic, quantitative investigation of how the written language of individuals experiencing suicidal ideation differs from that of a matched non-suicidal population. A purpose-built Suicide Discourse Corpus (SDC) of approximately 452,000 tokens was compiled from four heterogeneous sources: anonymized crisis helpline transcripts, Reddit posts from mental health disclosure communities, published first-person narratives of suicidal crises, and archival farewell notes. A Matched Control Corpus (MCC) of 449,800 tokens from general online discourse was constructed as a baseline. Analytical methods include keyness analysis (log-likelihood, G²), semantic domain profiling using the UCREL Semantic Analysis System (USAS), collocational analysis (Mutual Information scoring), and frequency analysis of grammatical and function-word categories. Findings reveal a statistically robust lexical signature in suicidal discourse marked by: (a) dramatically elevated pain, suffering, and death-related vocabulary; (b) absolutist and negation-heavy language reflecting cognitive constriction; (c) a depletion of future-oriented temporal reference and positive evaluative terms; (d) heightened first-person singular pronoun use alongside reduced social solidarity vocabulary; and (e) distinctive collocational frames encoding inescapable, internally directed suffering. These patterns align with major psychological theories of suicide, including Shneidman's psychache theory, Joiner's Interpersonal Theory of Suicide (IPTS), and Beck's cognitive model of hopelessness.
Implications for the design of NLP-assisted early-warning systems, ethical governance of mental health corpus research, and future multilingual extension of this work are discussed at length.
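
As an illustration of the keyness step named in the abstract, the following is a minimal sketch of the standard two-corpus log-likelihood (G²) calculation. It is not the authors' implementation, which the abstract does not describe. The corpus sizes are the token counts reported above; the function name and the word frequencies (300 and 100) are hypothetical values chosen only to show the arithmetic.

```python
import math

# Token counts reported in the abstract.
SDC_TOKENS = 452_000   # Suicide Discourse Corpus (approximate)
MCC_TOKENS = 449_800   # Matched Control Corpus

# Chi-square critical value for 1 degree of freedom at p < 0.05.
CRITICAL_P05 = 3.84


def log_likelihood(freq_target, freq_ref, size_target, size_ref):
    """Return the G2 keyness score for one word across two corpora.

    Expected frequencies assume the word is spread over both corpora
    in proportion to corpus size; G2 measures departure from that.
    """
    total_tokens = size_target + size_ref
    combined_freq = freq_target + freq_ref
    expected_target = size_target * combined_freq / total_tokens
    expected_ref = size_ref * combined_freq / total_tokens

    score = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        # A zero observed count contributes nothing (0 * ln 0 is taken as 0).
        if observed > 0:
            score += observed * math.log(observed / expected)
    return 2.0 * score


if __name__ == "__main__":
    # Hypothetical counts for a single word; not figures from the study.
    g2 = log_likelihood(300, 100, SDC_TOKENS, MCC_TOKENS)
    print(f"G2 = {g2:.2f}, key at p < 0.05: {g2 > CRITICAL_P05}")
```

G² alone indicates only that a frequency difference is unlikely to be due to chance; the direction (overuse or underuse in the SDC) has to be read from the relative frequencies, and an effect-size measure is usually reported alongside it.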

References

• Al-Mosaiwi, M., & Johnstone, T. (2018). In an absolute state: Elevated use of absolutist words is a marker specific to anxiety, depression, and suicidal ideation. Clinical Psychological Science, 6(4), 529–542.

• Baumeister, R. F. (1990). Suicide as escape from self. Psychological Review, 97(1), 90–113.

• Beck, A. T. (1963). Thinking and depression: I. Idiosyncratic content and cognitive distortions. Archives of General Psychiatry, 9(4), 324–333.

• Beck, A. T. (1967). Depression: Clinical, experimental, and theoretical aspects. Harper & Row.

• Beck, A. T., Brown, G., & Steer, R. A. (1989). Prediction of eventual suicide in psychiatric inpatients by clinical ratings of hopelessness. Journal of Consulting and Clinical Psychology, 57(2), 309–310.

• Beck, A. T., Weissman, A., Lester, D., & Trexler, L. (1974). The measurement of pessimism: The Hopelessness Scale. Journal of Consulting and Clinical Psychology, 42(6), 861–865.

• Boroditsky, L. (2011). How language shapes thought. Scientific American, 304(2), 62–65.

• Brown, G. K., Beck, A. T., Steer, R. A., & Grisham, J. R. (2000). Risk factors for suicide in psychiatric outpatients: A 20-year prospective study. Journal of Consulting and Clinical Psychology, 68(3), 371–377.

• Chu, C., Buchman-Schmitt, J. M., Stanley, I. H., Hom, M. A., Tucker, R. P., Hagan, C. R., ... & Joiner, T. E. (2017). The interpersonal theory of suicide: A systematic review and meta-analysis of a decade of cross-national research. Psychological Bulletin, 143(12), 1313–1345.

• Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.

• Ishfaq, A., & Bhatti, A. M. (2019). Language shift and imminent language death: A diachronic study of Dawoodi. Harf-o-Sukhan, 3, 1–14.

• Ishfaq, A., & Bhatti, A. M. (2020). Lexical attrition and generational language competence in Dawoodi speakers. Harf-o-Sukhan, 4, 4–18.

• Ishfaq, A., & Bhatti, A. M. (2021). From pidgin to creole to collapse: The evolutionary trajectory of Dawoodi language. Jahan-e-Tahqeeq, 4.

• Ishfaq, A., Sultan, N., Hassan, N., Aleem, F., & Maldonado, M. G. (2022). Elif Shafak's Forty Rules of Love: Contextual variation in adjectives. International Online Journal of Language and Literature.

• Ishfaq, A., & Bhatti, A. M. (2022). Linguistic hegemony and the silencing of Dawoodi: Power, stigma, and structural marginalization. Jahan-e-Tahqeeq, 5(4), 48–60.

• Ishfaq, A., & Bhatti, A. M. (2023). Code-switching, borrowing, and linguistic dilution: Contact-induced change in Dawoodi. Jahan-e-Tahqeeq, 6(3), 577–592.

• Ishfaq, A., & Sultan, N. (2024). Identification of different methodologies for treatment of autism in Urdu-speaking adolescents: An investigative report. Contemporary Journal of Social Science Review, 2(4), 1611–1618.

• Ishfaq, A., & Bhatti, A. M. (2024). From heritage to liability: Language attitudes and identity reconstruction as drivers of obsolescence in the Dawoodi language. Al-Mahdi Research Journal (MRJ), 5(3), 1303–1336.

• Ishfaq, A., Malik, A. H., & Sultan, N. (2025). Developing trauma-sensitive pedagogical practices for resilient learning in academia: A multidisciplinary approach of psycholinguistics and ELT. Al Aasar, 2(1), 171–189.

• Ishfaq, A., Sultan, N., & Healy, B. (2025). Turn-taking, politeness, and identity: A conversational study of Speak Your Heart. Journal of Applied Linguistics and TESOL (JALT), 8(3), 1567–1581.

• Ishfaq, A., & Azim, M. U. (2025). Phono-semantics and translation: A cross-linguistic study of Urdu and Punjabi ideophones. International Research Journal of Arts, Humanities and Social Sciences, 2(3).

• Ishfaq, A., Ahmad, S., & Sultan, N. (2025). Decoding despair: A multidisciplinary psycho-forensic linguistic approach to suicide notes. Journal of Psychology, Health and Social Challenges, 3(2), 83–89.

• Ishfaq, A., & Sultan, N. (2025). Narrative and meaning in Surah Yūsuf: A critical hermeneutic analysis. AL-HAYAT Research Journal (AHRJ), 2(4), 11–23.

• Ishfaq, A., & Sultan, N. (2025). Trauma, resilience, and narrative healing: A psycho-hermeneutic reading of Surah Yūsuf. AL-JAMEI Research Journal, 3(1), 229–239.

• Ishfaq, A. (2025). Ethnolinguistic identity and cultural memory in the Dawoodi community. Annual Methodological Archive Research Review, 3(6), 185–208.

• Khan, I. A., Khaled, F., & Ishfaq, A. (2025). Operation Bunyan un Marsoos: A critical analysis of human rights compliance—A study of the operation's adherence to human rights law and international humanitarian law. Dialogue Social Science Review (DSSR), 3(6), 173–184.

• Ishfaq, A., & Sultan, N. (2025). Cognitive control and executive function in advanced second-language writing. Sareer-a-Khama, 4(4).

• Sultan, N., & Ishfaq, A. (2026). Instagram captions as identity performance: A multimodal discourse analysis. AL-HAYAT Research Journal (AHRJ), 3(2), 9–21.

• Ishfaq, A., & Sultan, N. (2026). Modeling meaning in generative AI: A corpus-assisted discourse analysis of coherence, framing, and persuasion. Pakistan Journal of Social Science Review, 5(3), 1147–1164.

• Ahmad, A. I. S. (2026). Language of influence: A corpus-based lexical analysis of psychological power strategies in Robert Greene's The Laws of Human Nature. In Proceedings of the 2nd Riphah International Conference on Language, Literature, and Culture.

• Ishfaq, A. (2026). Thinking through language: Linguistic foundation and advanced academic writing.

• Coppersmith, G., Dredze, M., & Harman, C. (2014). Quantifying mental health signals in Twitter. Proceedings of the Workshop on Computational Linguistics and Clinical Psychology, 51–60.

• Coppersmith, G., Dredze, M., Harman, C., & Hollingshead, K. (2015). From ADHD to SAD: Analyzing the language of mental health on Twitter through self-reported diagnoses. Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology.

• Fawcett, J., Scheftner, W. A., Fogg, L., Clark, D. C., Young, M. A., Hedeker, D., & Gibbons, R. (1990). Time-related predictors of suicide in major affective disorder. American Journal of Psychiatry, 147(9), 1189–1194.

• Firth, J. R. (1957). A synopsis of linguistic theory 1930–1955. Studies in Linguistic Analysis, 1–32. Blackwell.

• Fussell, S. R. (Ed.). (2002). The verbal communication of emotions: Interdisciplinary perspectives. Lawrence Erlbaum Associates.

• Gabrielatos, C., & Marchi, A. (2012). Keyness: Appropriate metrics and practical issues. Presented at the CADS International Conference, University of Bologna.

• Gaur, M., Alambo, A., Sain, J. P., Kursuncu, U., Thirunarayan, K., Kavuluru, R., ... & Sheth, A. (2019). Knowledge-aware assessment of severity of suicide risk for early intervention. Proceedings of The Web Conference 2019.

• Gkotsis, G., Oellrich, A., Hubbard, T., Dobson, R., Liakata, M., Velupillai, S., & Dutta, R. (2017). Characterisation of mental health conditions in social media using informed deep learning. Scientific Reports, 7, 45141.

• Hunston, S. (2002). Corpora in applied linguistics. Cambridge University Press.

• Ingram, R. E. (1990). Self-focused attention in clinical disorders: Review and a conceptual model. Psychological Bulletin, 107(2), 156–176.

• Ji, S., Pan, S., Li, X., Cambria, E., Long, G., & Huang, Z. (2021). Suicidal ideation detection: A review of machine learning methods and applications. IEEE Transactions on Computational Social Systems, 8(1), 214–226.

• Joiner, T. E. (2005). Why people die by suicide. Harvard University Press.

• Kövecses, Z. (2000). Metaphor and emotion: Language, culture, and body in human feeling. Cambridge University Press.

• Leenaars, A. A. (1988). Suicide notes. Human Sciences Press.

• Luoma, J. B., Martin, C. E., & Pearson, J. L. (2002). Contact with mental health and primary care providers before suicide: A review of the evidence. American Journal of Psychiatry, 159(6), 909–916.

• Mor, N., & Winquist, J. (2002). Self-focused attention and negative affect: A meta-analysis. Psychological Bulletin, 128(4), 638–662.

• O'Connor, R. C. (2011). Towards an integrated motivational–volitional model of suicidal behaviour. In R. C. O'Connor, S. Platt, & J. Gordon (Eds.), International handbook of suicide prevention (pp. 181–198). Wiley-Blackwell.

• O'Connor, R. C., & Kirtley, O. J. (2018). The integrated motivational–volitional model of suicidal behaviour. Philosophical Transactions of the Royal Society B, 373(1754).

• Pennebaker, J. W., Mehl, M. R., & Niederhoffer, K. G. (2003). Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology, 54(1), 547–577.

• Pestian, J. P., Matykiewicz, P., Linn-Gust, M., South, B., Uzuner, O., Wiebe, J., ... & Brew, C. (2012). Sentiment analysis of suicide notes: A shared task. Biomedical Informatics Insights, 5, 3–16.

• Pirkis, J., & Burgess, P. (1998). Suicide and recency of health care contacts. British Journal of Psychiatry, 173(6), 462–474.

• Priel, B., Mitrany, D., & Shahar, G. (1998). Closeness, support and reciprocity: A study of attachment styles in adolescence. Personality and Individual Differences, 25(6), 1183–1197.

• Rayson, P. (2008). Wmatrix: A web-based corpus processing environment. Lancaster University.

• Rayson, P., Archer, D., Piao, S., & McEnery, T. (2004). The UCREL Semantic Analysis System. Proceedings of the LREC Workshop on Beyond Named Entity Recognition.

• Scott, M. (1997). PC analysis of key words – and key key words. System, 25(2), 233–245.

• Shing, H. C., Nair, S., Zirikly, A., Friedenberg, M., Daumé III, H., & Resnik, P. (2018). Expert, crowdsourced, and machine assessment of suicide risk via online postings. Proceedings of the 5th Workshop on Computational Linguistics and Clinical Psychology.

• Shneidman, E. S. (1993). Suicide as psychache: A clinical approach to self-destructive behavior. Jason Aronson.

• Shneidman, E. S. (1996). The suicidal mind. Oxford University Press.

• Sinclair, J. (1991). Corpus, concordance, collocation. Oxford University Press.

• Stirman, S. W., & Pennebaker, J. W. (2001). Word use in the poetry of suicidal and nonsuicidal poets. Psychosomatic Medicine, 63(4), 517–522.

• Tay, D. (2012). Applying the notion of metaphor types to compare counseling and everyday talk. Journal of Counseling & Development, 90(3), 347–356.

• Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54.

• Van Orden, K. A., Witte, T. K., Cukrowicz, K. C., Braithwaite, S. R., Selby, E. A., & Joiner, T. E. (2010). The interpersonal theory of suicide. Psychological Review, 117(2), 575–600.

• World Health Organization. (2021). Suicide worldwide in 2019: Global health estimates. WHO Press.

• Zirikly, A., Resnik, P., Uzuner, O., & Hollingshead, K. (2019). CLPsych 2019 shared task: Predicting the degree of suicide risk in Reddit posts. Proceedings of the 6th Workshop on Computational Linguistics and Clinical Psychology.

Published

2026-03-24