USING SLURM WORKLOAD MANAGER FOR MANAGING SUPERCOMPUTERS AND LINUX CLUSTERS

DOI: 10.31673/2412-4338.2022.024652

Authors

  • В. О. Дрига (Dryha V. O.), State University of Telecommunications, Kyiv
  • М. Д. Бриксіна (Bryksina M. D.), State University of Telecommunications, Kyiv

Abstract

This article examines the use of the Slurm Workload Manager for managing supercomputers and Linux clusters, and highlights the importance and advantages of Slurm for resource management on multi-user systems. As part of the research, existing workload management systems and their limitations were analyzed. This analysis found that Slurm is one of the most widely used and efficient solutions for resource management on multi-user systems.
The article examines Slurm's functionality in detail, including its core functions: resource scheduling, task distribution, and system monitoring. Slurm lets users utilize computational resources efficiently, distribute tasks among cluster nodes, optimize CPU time usage, and control system load. These features enable high productivity and efficient resource utilization.
The article presents the advantages of Slurm over other resource management systems. Slurm is noted for its flexibility, its ability to configure various types of resources, and its support for different scheduling algorithms. The limitations and challenges of using Slurm are also discussed, giving readers a comprehensive understanding of its capabilities and of the considerations that apply when deploying it.
The article gives readers a detailed overview of the Slurm Workload Manager, covering its core functions, advantages, and limitations. It is intended as a reference for professionals working with large-scale computing clusters and supercomputers, and for readers interested in best practices for resource management on multi-user systems and in effective strategies for using Slurm.

Keywords: SLURM, Workload Manager, Supercomputers, Linux Clusters, Resource Management, Task Allocation, Parallel Computing, High Performance Computing, Configuration Templates, Resource Monitoring.

Published

2023-09-25
