Self-healing method for distributed software in heterogeneous computer systems

DOI: 10.31673/2412-4338.2019.031723

Authors

  • І. В. Рубан (Ruban I. V.), Kharkiv National University of Radio Electronics, Kharkiv
  • М. О. Волк (Volk M. O.), Kharkiv National University of Radio Electronics, Kharkiv
  • М. В. Рісухін (Risukhin M. V.), Kharkiv National University of Radio Electronics, Kharkiv

Abstract

Modern software systems are developed and run while hardware platforms and operating systems continuously evolve. Most of them are distributed, with software components hosted on remote, heterogeneous computing resources. Sooner or later a program, a computing resource, a piece of communications equipment, or one of the available services fails. In some cases the software components may also be affected by external influences.
In software systems, the term "self-healing" refers to an application, service, or system that can detect that it is not working properly and, without any human intervention, make the changes needed to restore itself to its normal or designed state. The system can take the same actions in advance when its structural elements are likely to fail. The problem is to build a fault-tolerant system that responds to software and hardware changes and either repairs itself after crashes or takes appropriate action when a failure occurs.
This article considers the levels of software self-healing. Models of distributed software and of heterogeneous computer systems are proposed that take into account the architecture of the software components and the heterogeneous nature of modern computing resources. A self-healing method for distributed software systems has been developed to restore the functionality of software components running on heterogeneous computer resources. The self-healing systems under study can be divided into three levels, depending on the type of resources that are tracked and influenced: application software, system software, and hardware. The developed method works at all three levels.
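As an illustration of the three levels and of the detect-and-restore behaviour described above, the following is a minimal Python sketch. It is not the method developed in the article, which the abstract does not specify. The names `Level`, `Component`, and `heal`, and the callback-based health checks, are assumptions made for this example only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Level(Enum):
    """The three levels named in the abstract."""
    APPLICATION = "application software"
    SYSTEM = "system software"
    HARDWARE = "hardware"


@dataclass
class Component:
    """Hypothetical monitored element; not a structure taken from the article."""
    name: str
    level: Level
    is_healthy: Callable[[], bool]  # returns True when the element works
    restore: Callable[[], None]     # attempts to bring it back to normal


def heal(components: List[Component]) -> List[str]:
    """Check every component once and try to restore the failed ones.

    Returns the names of components that are still unhealthy afterwards,
    so that a caller can take some other action for them.
    """
    unrecovered = []
    for component in components:
        if component.is_healthy():
            continue
        component.restore()  # automatic action, no human intervention
        if not component.is_healthy():
            unrecovered.append(component.name)
    return unrecovered


# Simulated state: the worker can be restarted, the disk cannot be repaired.
state = {"worker": False, "disk": False}
demo = [
    Component("worker", Level.APPLICATION,
              lambda: state["worker"], lambda: state.update(worker=True)),
    Component("disk", Level.HARDWARE,
              lambda: state["disk"], lambda: None),
]
print(heal(demo))  # prints ['disk']
```

In this sketch a component that cannot be restored is only reported back; a real system would need a further policy for such cases, for example moving the software component to another resource.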

Keywords: self-healing, distributed software, heterogeneous computer systems.

Published

2019-11-18
