DISTRIBUTIONAL SEMANTICS AND TAGSET OF NOUNS TAXONOMIES: A CORPUS-DRIVEN STUDY OF SARAIKI FOLK SONGS
DOI: https://doi.org/10.63878/jalt951
Keywords: Corpus-Driven Approach, Noun Taxonomies, Semantic Networks, Saraiki Folk Songs, Word Formation
Abstract
This study investigates the distribution of semantic networks of nouns in Saraiki folk songs through a corpus-driven approach, using the 3A Model for annotation and analysis. Saraiki, spoken by nearly 20 million people in Pakistan, has limited digital resources; the iJunoonSaraiki dictionary (2017) is one of the few available. To address the need for a structured lexical resource, the study aims to uncover the underlying semantic relationships among Saraiki nouns, which could contribute to the development of a Saraiki WordNet. A specialized corpus of 0.25 million words was compiled from the folk songs, and nouns were tagged, categorized, and manually annotated using UAM CorpusTool. Unlike traditional approaches, a discourse analysis framework was employed to examine the contextual use of nouns and identify recurring patterns in semantic relations. Specifically, relational semantics was used to analyze connections such as hyponymy and meronymy (part-whole relations). The findings revealed that meronymic relationships were the most frequent, while antonymy was the least frequent. This study highlights the importance of corpus-driven methods for understanding the semantic structures of under-resourced languages and suggests that insights gained from Saraiki noun relations could be central to creating digital tools for Natural Language Processing (NLP) and computational linguistics applications.
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.