ROMAN URDU AND CODE-MIXED LANGUAGE PROCESSING FOR SOCIAL MEDIA ANALYTICS IN PAKISTAN

Sabeen Amjad; Ijlal Hussain; Nasir Ullah Khan

doi:10.63878/jalt2212

Authors

Sabeen Amjad, Department of English, SZABIST University
Ijlal Hussain, Department of English, SZABIST University
Nasir Ullah Khan, Department of English, SZABIST University

DOI:

https://doi.org/10.63878/jalt2212

Abstract

This study examines Roman Urdu and Urdu-English code-mixing on social media, together with the problems that Roman Urdu, code-mixing, and digital Urdu pose online. It describes the linguistic structure of Roman Urdu social media posts, focusing on three topics: cricket, dramas, and politics. A manual corpus-based approach, combined with other methods, was applied to 900 posts collected from YouTube, Facebook, and Twitter (now X). Both code-mixing and Roman Urdu were widespread in the collected posts, and most of the code-mixing was intra-sentential rather than inter-sentential. Sentiment classification found the posts almost evenly balanced among positive, negative, and neutral sentiments. The posts also exposed several challenges for Natural Language Processing, including the lack of a standard corpus, spelling variation, and linguistic uncertainty. The study also explores, for the first time, the need for organized and advanced Natural Language Processing technology for Pakistan's multilingual digital space.

Journal: jalt (HEC Recognized, Y Category)
