A CORPUS-BASED COMPARATIVE ANALYSIS OF LEXICAL DIVERSITY OUTCOMES IN THE ENGLISH WRITING OF PUBLIC AND PRIVATE SECONDARY SCHOOL STUDENTS IN GUJRANWALA DISTRICT, PAKISTAN

Authors

  • Syed Sajid Ali Shah, PhD Scholar, Department of Linguistics, University of South Asia, Lahore, Pakistan
  • Muhammad Liaqat, PhD Scholar in English Linguistics, Department of English, University of South Asia, Lahore, Pakistan
  • Muhammad Tanveer Aslam, PhD English Linguistics, Visiting Lecturer, UOL Sargodha Campus, Pakistan
  • Choudhry Shahid (Corresponding Author), Professor, Department of English, University of South Asia, Lahore, Pakistan

DOI:

https://doi.org/10.63878/jalt2214

Keywords:

Corpus Linguistics; Lexical Diversity; Type-Token Ratio (TTR); Biber’s Multidimensional Framework; Stratified Educational Divide; ESL Writing in Pakistan.

Abstract

This study investigated systemic disparities in productive English writing between public and private secondary school students in Gujranwala District, Pakistan, by empirically evaluating their lexical diversity outcomes. In Pakistan’s stratified educational ecosystem, socio-economic divides drive structural imbalances between resource-constrained public schools relying on the Grammar-Translation Method (GTM) and English-medium private schools using communicative approaches. The study explored how these contrasting configurations shape active vocabulary deployment. The Gujranwala Secondary School Learner Corpus was compiled from the writing of 100 secondary students in grades 9 and 10. The sample was balanced by school type, with 50 students from public schools and 50 from private schools, and included both boys and girls. Each student completed a 45-minute timed story-writing task designed to assess writing proficiency. The resulting corpus provides a detailed record of how these students write. The linguistic analysis was grounded in Biber’s Multidimensional Analysis Framework (Dimension 1: Involved vs. Informational Production) alongside Oxford’s Strategic Self-Regulation (S2R) Model to evaluate vocabulary richness and communicative automaticity. Textual data were processed in AntConc (v4.2.4) and statistically verified using an independent-samples t-test. The private school cohort demonstrated significantly greater lexical automaticity, producing 11,450 running tokens (M = 229.0 words per essay) compared with the public cohort’s 6,120 tokens (M = 122.4 words per essay).
Surface lexical diversity, measured as mean Type-Token Ratio (TTR), was significantly higher in private texts (M = 0.61, SD = 0.05) than in public texts (M = 0.43, SD = 0.07), t(98) = 14.83, p < .001, representing an exceptionally large effect of school type (Cohen’s d = 2.96). Private essays featured a higher content-word density (54.8%) than public sector essays (39.2%), which relied heavily on repeated function words (60.8%) and static, memorized examination templates. The findings indicate that private schooling yields substantially richer productive vocabulary, whereas public sector constraints encourage a restricted mental lexicon and repetitive structural avoidance. Future research should pursue multi-district longitudinal lexical tracking using advanced natural language processing (NLP) architectures to monitor vocabulary development across regions.
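
As an illustration of the measures reported above, the following minimal Python sketch computes a Type-Token Ratio for a single text and recomputes Cohen’s d and the t statistic from the rounded group summaries given in the abstract. It is not the authors’ pipeline: the study used AntConc, whereas the regex tokenizer, the function names, and the pooled-variance formulas here are assumptions made for illustration only.

```python
import math
import re


def type_token_ratio(text: str) -> float:
    """TTR = distinct word forms (types) / running words (tokens).

    Assumption: tokens are lowercase alphabetic strings (apostrophes allowed).
    AntConc's token definition is configurable and may differ from this.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)


def cohens_d(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference using the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1: float, sd1: float, n1: int,
              m2: float, sd2: float, n2: int) -> float:
    """Equal-variance independent-samples t statistic, df = n1 + n2 - 2."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / standard_error


# Toy sentence (hypothetical, not from the corpus): 6 tokens, 5 types.
sample_ttr = type_token_ratio("The cat sat on the mat")

# Rounded group summaries as reported in the abstract (n = 50 per group).
d = cohens_d(0.61, 0.05, 50, 0.43, 0.07, 50)
t = student_t(0.61, 0.05, 50, 0.43, 0.07, 50)

print(f"sample TTR = {sample_ttr:.3f}")
print(f"Cohen's d  = {d:.2f}")
print(f"t(98)      = {t:.2f}")
```

Run on the rounded summaries, the sketch gives d of about 2.96, matching the reported value, and t of about 14.80, slightly below the reported 14.83. The small gap presumably reflects rounding of the published means and standard deviations.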

Published

2026-03-23