Sentiment Orientation from Code-mixed Social Media Data
Abstract
Collecting and evaluating data is becoming a widely acknowledged challenge in a highly connected world. In the 21st century, with the growing popularity of social networks, social media information is being archived at an alarming rate. Informal use of local languages is very common on social media platforms. In natural language processing, Sentiment Analysis (SA) is the specialized process of determining user orientation from opinion data on the social web. Code-mixed social media data in particular is challenging to process, because it mixes several languages, which users combine for linguistic efficiency. In this paper, we propose a model called Code-mixed Sentiment Analyzer (cmSentiAnalyzer) to derive sentiment orientation from code-mixed sentences. The proposed model uses language features across the code-mixed languages to map words occurring in different languages to a common space. Our experiments show that cmSentiAnalyzer outperforms baseline approaches in sentiment analysis of code-mixed text by 2% in accuracy, with an average precision of 89%.
How to Cite
Kavita Sanjay Asnani, & Floyd Avina Fernandes. (2021). Sentiment Orientation from Code-mixed Social Media Data. International Journal of Next-Generation Computing, 12(1), 22–29. https://doi.org/10.47164/ijngc.v12i1.187