Detection of traffic anomalies in the information systems of organizations using Machine Learning methods on the base of algorithms for forecasting category fields
DOI: 10.31673/2412-4338.2021.044153
Abstract
The article examines the problem of detecting anomalies in the network traffic of organizations' information systems. Detecting anomalies in network traffic makes it possible to identify hidden malicious activity in data obtained from protocols that collect statistics on the information system's network traffic. This, in turn, makes it possible to reduce the load and to configure the attributes used to monitor and analyze network traffic. The authors propose a network traffic anomaly detection architecture divided into functional levels. Protocols for collecting statistics were analyzed, namely NetFlow/IPFIX, which provides comprehensive information based on packet headers. To process and analyze the received data, the authors developed a model for detecting anomalies in the traffic of the information system. The model takes statistical data for further processing and can store data in a repository. All received data is filtered to detect malicious processes, then transferred to and stored in the attack database repository, with the possibility of generating alerts and identifying the attack. For this model, the use of Machine Learning based on methods for predicting categorical fields is proposed. The work used a dataset of firewall data containing information on the number and size of transmitted and received packets, as well as data on the use of malicious software. Using this method, an experimental study was conducted to predict the presence of malicious software in the data. The prediction of categorical fields with Logistic Regression, SVM, Random Forest Classifier, and other classification algorithms was investigated. Based on the results, a confusion matrix was built, which makes it possible to estimate the error of the algorithms.
Keywords: information system, anomaly, attack, method, model, algorithm, machine learning.
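The confusion matrix mentioned in the abstract can be illustrated with a minimal sketch. The labels and predictions below are made-up values for illustration only, not data from the study, and the helper names `confusion_matrix` and `error_rate` are hypothetical. The sketch shows only how per-class counts and an overall error rate could be derived from a classifier's output, whichever algorithm produced it.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return nested counts: matrix[actual][predicted] for every label pair."""
    counts = Counter(zip(y_true, y_pred))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


def error_rate(matrix):
    """Share of samples that fall off the diagonal (misclassified)."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[label][label] for label in matrix)
    return (total - correct) / total if total else 0.0


# Hypothetical ground truth and predictions (assumed, not from the article).
LABELS = ["benign", "malware"]
y_true = ["benign", "benign", "malware", "malware", "benign", "malware"]
y_pred = ["benign", "malware", "malware", "benign", "benign", "malware"]

cm = confusion_matrix(y_true, y_pred, LABELS)
for actual in LABELS:
    print(actual, cm[actual])
print("error rate:", round(error_rate(cm), 3))
```

Rows correspond to the actual class and columns to the predicted class, so the off-diagonal cells separate false alarms from missed malware, which a single accuracy figure would hide.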