Features Elimination for Opinion Classification on Social Networks using Vertical Data Format

Authors

  • Atchara Choompol, Kalasin University
  • Mongkol Saensuk, Data Science Program, Faculty of Science and Technology, Sakon Nakhon Rajabhat University

DOI:

https://doi.org/10.14456/jeit.2023.12

Keywords:

Features Elimination, Opinion Classification, Social Networks, Vertical Data Format

Abstract

This research developed an algorithm that reduces features without degrading classification, thereby decreasing processing time and improving classification accuracy. Feature reduction using the vertical data model method was compared with feature reduction using the chi-square method. The Naïve Bayes method was used to classify opinions. The data used in the research were collected from Stanford Twitter Sentiment Data. The most effective feature reduction method was the one employing the vertical data model, which provided an average efficiency of 72.75%.
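
The abstract does not spell out the elimination rule, so the following is only a minimal sketch of what feature elimination over a vertical data format might look like. It assumes that "vertical" means mapping each term to the set of message ids that contain it, and that a term is kept when it is frequent enough and skewed toward one class. The function names and the `min_support` and `min_purity` thresholds are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def to_vertical(docs):
    """Map each term to the set of document ids that contain it."""
    vertical = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for term in tokens:
            vertical[term].add(doc_id)
    return dict(vertical)


def eliminate_features(vertical, labels, min_support=2, min_purity=0.75):
    """Keep terms that are frequent enough and skewed toward one class.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    kept = {}
    for term, doc_ids in vertical.items():
        support = len(doc_ids)
        if support < min_support:
            continue  # too rare to be a reliable feature
        positives = sum(1 for d in doc_ids if labels[d] == "pos")
        purity = max(positives, support - positives) / support
        if purity >= min_purity:
            kept[term] = doc_ids
    return kept


# Toy data standing in for tokenized tweets and their sentiment labels.
docs = [
    ["love", "this", "phone"],
    ["love", "the", "camera"],
    ["hate", "this", "phone"],
    ["hate", "the", "battery"],
]
labels = ["pos", "pos", "neg", "neg"]

vertical = to_vertical(docs)
selected = eliminate_features(vertical, labels)
print(sorted(selected))
```

In this toy example only the terms that appear in at least two messages of a single class survive; the reduced term set could then be passed to a Naïve Bayes classifier, as in the study.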

Published

2023-06-29

How to Cite

[1] A. Choompol and M. Saensuk, “Features Elimination for Opinion Classification on Social Networks using Vertical Data Format”, JEIT, vol. 1, no. 3, pp. 38–45, Jun. 2023.