Online Shoppers' Purchase Intention using Ensemble Learning Approach

Jhimli Adhikari

Abstract

A customer's purchase intention can be predicted by analyzing that customer's history. In this study, we analyze online shopper data to build a better model for predicting purchase intention. We first perform exploratory data analysis on the dataset. We then use several ensemble algorithms, namely Random Forest, Gradient Boosting, XGBoost and LightGBM, to predict whether a customer visiting the website of an online shop will end the visit with a purchase. Next, we re-run the ensemble methods after applying SMOTE to overcome class imbalance, with the aim of improving their performance. Finally, model performance is evaluated after parameter tuning. The study shows that the XGBoost model reaches 89.97% accuracy on the imbalanced data, whereas Random Forest reaches 93% accuracy after SMOTE is applied. After parameter tuning, XGBoost achieves the highest accuracy, 93.54%. The execution time of the models is also reported.
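The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the interpolation idea behind SMOTE. This is not the author's code, and the study presumably relied on a library implementation. The function name `smote`, its parameters, and the toy data are hypothetical, chosen only to show how synthetic minority samples are placed between an existing minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative sketch only: hypothetical helper, standard library only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Made-up toy data: 8 "no purchase" sessions versus 3 "purchase" sessions.
majority = [[float(i), float(i % 3)] for i in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints "8 8"
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to measure accuracy.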

How to Cite
Adhikari, J. (2023). Online Shoppers’ Purchase Intention using Ensemble Learning Approach. International Journal of Next-Generation Computing, 14(4). https://doi.org/10.47164/ijngc.v14i4.1065
