Network Traffic Classification Using Feature Selections and two-tier stacked classifier


Rahul Adhao
Vinod Pachghare

Abstract

The datasets available for evaluating intrusion detection systems (IDS) are noisy and highly imbalanced. The noise can be reduced through dataset pre-processing and feature selection. The imbalance arises because these datasets contain many records for some class labels (majority attacks, e.g., DoS, DDoS, Port Scan) and very few records for other class labels (minority attacks, e.g., U2R, R2L). A single machine learning algorithm (classifier) trained on such a dataset becomes biased towards the majority attack records and may fail to detect minority attacks. One possible way to reduce this class imbalance is to divide the dataset into majority and minority attacks. The proposed approach divides the dataset into majority and minority groups and uses a two-tier classification approach to classify majority and minority attacks. The CICIDS2017 and NSL-KDD datasets are used to evaluate the proposed system, which achieves an accuracy of 98.30% on the CICIDS2017 dataset and 99.71% on the NSL-KDD dataset. The model's performance, measured in terms of precision, accuracy, and F1 score, is observed to be superior to existing work in the field of intrusion detection.
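The idea of splitting class labels into majority and minority groups and classifying in two tiers can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: the abstract does not say how the groups are formed or which classifiers are stacked. The frequency threshold `min_share`, the nearest-centroid stand-in classifier, the routing step, and the toy two-feature records below are all assumptions made for illustration.

```python
from collections import Counter


def split_labels_by_frequency(labels, min_share):
    """Return (majority, minority) sets of class labels.

    Assumption: a label counts as 'majority' when it makes up at least
    `min_share` of all records. The abstract does not define the rule.
    """
    counts = Counter(labels)
    total = len(labels)
    majority = {c for c, n in counts.items() if n / total >= min_share}
    return majority, set(counts) - majority


class NearestCentroid:
    """Stand-in classifier (hypothetical choice): predicts the label
    whose class mean is closest in squared Euclidean distance."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, row):
        def dist2(label):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[label]))
        return min(self.centroids, key=dist2)


class TwoTierClassifier:
    """Tier 1 routes a record to the majority or minority group;
    tier 2 picks the specific class within that group."""

    def fit(self, X, y, min_share):
        self.majority, self.minority = split_labels_by_frequency(y, min_share)
        groups = ["majority" if label in self.majority else "minority" for label in y]
        self.router = NearestCentroid().fit(X, groups)
        self.experts = {}
        for name in ("majority", "minority"):
            rows = [(x, label) for x, label, g in zip(X, y, groups) if g == name]
            if rows:
                xs, ys = zip(*rows)
                self.experts[name] = NearestCentroid().fit(xs, ys)
        return self

    def predict(self, X):
        return [self.experts[self.router.predict_one(x)].predict_one(x) for x in X]


# Toy, made-up records with two numeric features (not taken from any dataset).
X = ([(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)] * 2    # DoS
     + [(0.0, 5.0), (0.0, 6.0), (1.0, 5.0), (1.0, 6.0)] * 2  # PortScan
     + [(10.0, 0.0), (11.0, 1.0)]                            # U2R
     + [(10.0, 5.0), (11.0, 6.0)])                           # R2L
y = ["DoS"] * 8 + ["PortScan"] * 8 + ["U2R"] * 2 + ["R2L"] * 2

model = TwoTierClassifier().fit(X, y, min_share=0.2)
predictions = model.predict([(0.5, 0.2), (10.2, 5.1)])
print(sorted(model.majority), sorted(model.minority))
print(predictions)
```

In this sketch the minority classes get their own second-tier model, so they are not competing directly with the far more numerous majority records. A real system would replace the stand-in classifier with trained models and would choose the grouping rule from the class distribution of the actual dataset.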


How to Cite
Adhao, R., & Pachghare, V. (2021). Network Traffic Classification Using Feature Selections and two-tier stacked classifier. International Journal of Next-Generation Computing, 12(5). https://doi.org/10.47164/ijngc.v12i5.422
