SEMANTIC ANNOTATION OF SELECTED URDUIZED WORDS IN PAKISTANI ENGLISH: EVIDENCE FROM PAKLOCCORPUS

Fatima Tuz Zahra; Dr. Tehseen Zahra

doi:10.63878/jalt1688

Authors

Fatima Tuz Zahra, PhD Scholar, Air University Islamabad; Lecturer, Minhaj University Lahore
Dr. Tehseen Zahra, Associate Professor, Bahria University Islamabad


Abstract

Urduized words embedded in English discourse are a defining feature of Pakistani English. Despite their frequency and sociocultural salience, these words remain under-represented in corpus annotation frameworks and natural language processing (NLP) systems. This paper presents a corpus-driven semantic annotation framework for Urduized words, grounded in data from PakLocCorpus (2022). Using concordance evidence from the PakLocCorpus Urduized word list, the study conducts a discourse analysis of selected words following Gee (2011) and develops a context-sensitive semantic tag-set that captures the culturally grounded meaning domains of descriptive labels. The analysis demonstrates that Urduized words function as semantically dense cultural carriers rather than peripheral borrowings. The paper argues that systematic semantic annotation of Urduized words is essential for inclusive corpus linguistics and for reducing structural bias in English-focused NLP technologies. It further argues that descriptive labels have unique morpho-syntactic features that can be used in designing tagging tools and frameworks. The findings can also inform pedagogical strategies for English language teaching (ELT) and second language acquisition (SLA).

