A CROSS-DISCIPLINARY CORPUS ANALYSIS OF MORPHOSYNTACTIC PATTERNS IN UNDERGRADUATE RESEARCH PAPERS

Authors

  • Syed Muhammad Aziz Ali Naqvi, Mohammad Aafaq Nadeem, Naila Aslam

DOI:

https://doi.org/10.63878/jalt1632

Abstract

This paper investigates the morphosyntax of undergraduate research papers through cross-disciplinary corpus research, showing how these linguistic features reflect disciplinary ideals and language development in novice scholarly writing. Morphosyntactic constructions (nominalizations, passive structures, verb usage, and clause subordination) are needed to build the objectivity, abstraction, and rhetorical persuasion of scholarly writing. Analyzing them at the undergraduate level matters because students at this stage are moving from general writing practices to discipline-specific ones, a transition that can be difficult because they must meet different epistemological requirements across fields. The study addresses three research questions: (1) What are the most frequently occurring morphosyntactic features of undergraduate research papers? (2) How do these features differ across disciplines, especially between humanities/social sciences and STEM? (3) What do these patterns imply for teaching academic writing? To build a balanced corpus of about 500,000 words, 100 anonymized undergraduate research papers (20 each from humanities, social sciences, natural sciences, engineering, and business) were collected from open-access university archives with full ethical clearance. Texts were preprocessed (tokenized and lemmatized) with spaCy and annotated morphosyntactically under the Universal Dependencies framework with Stanza. Findings show moderate overall use of passive voice (32.4 per 1,000 words) and nominalizations (45.6 per 1,000 words), and a dominance of the present simple tense (52 percent of verbs).
Large differences between disciplines were found: STEM subjects had more passive constructions (maximum 42.1) and higher nominalization density (maximum 55.3), which make content impersonal and compress information, whereas humanities and social sciences used more complex subordination (maximum 22.4 per 1,000 words) to support argumentation. The study extends corpus linguistics by testing the application of register variation theory to novice writing and offers practical implications for discipline-specific English for Academic Purposes pedagogy. The authors recommend targeted training on morphosyntactic features to help students develop their rhetorical skills and interdisciplinary literacy.
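
The rates reported in the abstract are normalized frequencies (occurrences per 1,000 words). As an illustration only, the following minimal sketch shows how such a rate could be computed from tokens already annotated with Universal Dependencies relations. It is not the authors' pipeline: the function names, the hand-annotated sample sentence, and the choice to exclude punctuation from the word count are all assumptions made for this example. The `aux:pass` relation is the Universal Dependencies label for a passive auxiliary.

```python
from typing import List, Tuple

# A token is represented as (word form, UD dependency relation).
# This simplified representation is an assumption for illustration.
Token = Tuple[str, str]


def per_thousand(count: int, total_words: int) -> float:
    """Normalized frequency: occurrences per 1,000 words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 1000.0 * count / total_words


def count_passive_auxiliaries(tokens: List[Token]) -> int:
    """Count tokens labeled with the UD relation 'aux:pass'."""
    return sum(1 for _form, deprel in tokens if deprel == "aux:pass")


def count_words(tokens: List[Token]) -> int:
    """Count tokens that are not punctuation (an assumed convention)."""
    return sum(1 for _form, deprel in tokens if deprel != "punct")


# Hypothetical hand-annotated sentence: "The samples were heated."
sample: List[Token] = [
    ("The", "det"),
    ("samples", "nsubj:pass"),
    ("were", "aux:pass"),
    ("heated", "root"),
    (".", "punct"),
]

passive_rate = per_thousand(count_passive_auxiliaries(sample), count_words(sample))
print(f"Passive constructions per 1,000 words: {passive_rate:.1f}")
```

In a full study the token list would come from a parser such as Stanza, not from hand annotation, and the count would be aggregated over the whole corpus. For scale, if the corpus were exactly 500,000 words, a rate of 32.4 per 1,000 words would correspond to about 16,200 passive constructions.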

Published

2025-12-30