Slurm Installation Guide
This page shows how to set up and configure a Slurm queueing system. Useful link: https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/
Pre-installation
Create global user account
Slurm and MUNGE users need to have a consistent UID/GID across all nodes in the cluster. Creating global user accounts must be done before installing the RPMs. It can be done via LDAPAdmin or any services that you use to manage users. If you don't have access to those services, please contact your system administrators.
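One way to confirm that the accounts really are consistent is to compare the numeric IDs on every node. The following is a minimal sketch, not part of the original procedure: the function name check_id is hypothetical, and the expected UID/GID values are whatever you assigned when creating the accounts.

```shell
# check_id USER EXPECTED_UID EXPECTED_GID   (hypothetical helper)
# Prints "ok: ..." and returns 0 when the local account matches,
# otherwise prints what was found and returns 1.
check_id() {
    user=$1 want_uid=$2 want_gid=$3
    # id fails if the account does not exist on this node
    got_uid=$(id -u "$user" 2>/dev/null) || { echo "missing: $user"; return 1; }
    got_gid=$(id -g "$user")
    if [ "$got_uid" = "$want_uid" ] && [ "$got_gid" = "$want_gid" ]; then
        echo "ok: $user uid=$got_uid gid=$got_gid"
    else
        echo "mismatch: $user uid=$got_uid gid=$got_gid (expected $want_uid/$want_gid)"
        return 1
    fi
}
```

Run it on each node for both accounts, for example check_id munge UID GID and check_id slurm UID GID with your own values substituted. Any "missing" or "mismatch" line should be fixed before installing the RPMs.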
Install the latest epel-release
CentOS8: dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
CentOS7: yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
RHEL7: yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Master Node Setup
Install MUNGE
MUNGE is the authentication service that Slurm uses to validate users' credentials.
$ sudo yum install munge munge-libs munge-devel
(master node only) Create secret key
$ dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
$ chown munge:munge /etc/munge/munge.key
$ chmod 400 /etc/munge/munge.key
For worker nodes, copy munge.key from the master node with scp, then set the correct ownership and permissions on each worker
$ scp -p /etc/munge/munge.key hostXXX:/etc/munge/munge.key
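Every node must hold a byte-identical key, so it can help to compare a checksum of the file on each node rather than inspecting the key itself. A minimal sketch, assuming GNU coreutils (sha256sum) is available; the function name key_digest is hypothetical.

```shell
# key_digest FILE   (hypothetical helper)
# Prints only the SHA-256 hex digest of FILE, without the file name,
# so the output from different nodes can be compared directly.
key_digest() {
    sha256sum "$1" | cut -d' ' -f1
}
```

Run key_digest /etc/munge/munge.key as root on the master and on each worker. The hex strings should be identical; if they differ, copy the key again.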
Set ownership and permissions on the following directories
$ chown -R munge: /etc/munge/ /var/log/munge/
$ chmod 0700 /etc/munge/ /var/log/munge/
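Wrong permissions on the key or its directories are a common reason for the daemon failing to start, so a quick check is worthwhile. A minimal sketch, assuming GNU stat (the -c option); the function name check_mode is hypothetical.

```shell
# check_mode PATH EXPECTED_MODE   (hypothetical helper)
# Compares the octal permission bits reported by GNU stat with EXPECTED_MODE.
check_mode() {
    got=$(stat -c '%a' "$1") || return 1
    if [ "$got" = "$2" ]; then
        echo "ok: $1 mode $got"
    else
        echo "wrong mode: $1 is $got, expected $2"
        return 1
    fi
}
```

For the values used on this page, that would be check_mode /etc/munge/munge.key 400, check_mode /etc/munge 700 and check_mode /var/log/munge 700. Ownership can be inspected the same way with stat -c '%U:%G' PATH, which should report munge:munge.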
Enable the MUNGE daemon at boot time and start it
$ systemctl enable munge
$ systemctl start munge
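Once the daemon is running, a credential round trip is a quick way to confirm it works. The sketch below wraps the standard munge -n | unmunge check in a function; the function name munge_selftest and its messages are hypothetical.

```shell
# munge_selftest   (hypothetical helper)
# Encodes an empty credential and decodes it locally. The pipeline's exit
# status is that of unmunge, which fails if the credential cannot be decoded.
munge_selftest() {
    if munge -n | unmunge >/dev/null; then
        echo "munge: local round trip ok"
    else
        echo "munge: local round trip FAILED"
        return 1
    fi
}
```

To check that a worker accepts credentials created on the master, the same idea works across nodes: munge -n | ssh hostXXX unmunge. A failure there usually points to a key mismatch or unsynchronised clocks.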
Increase the number of MUNGE threads on the master node (optional but recommended on busy servers)
$ cp /usr/lib/systemd/system/munge.service /etc/systemd/system/munge.service
$ vim /etc/systemd/system/munge.service
Edit this line >> ExecStart=/usr/sbin/munged --num-threads 10
Reload daemon and restart munge
$ systemctl daemon-reload
$ systemctl restart munge
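As an alternative to copying the whole unit file, systemd also supports drop-in overrides, which keep the packaged unit untouched. This is a different technique from the one described above, shown only as a sketch: the function name write_munge_override is hypothetical, and the ExecStart line assumes your packaged unit starts /usr/sbin/munged directly, as in the edit above.

```shell
# write_munge_override DIR THREADS   (hypothetical helper)
# Writes DIR/override.conf. On a real system DIR would be
# /etc/systemd/system/munge.service.d and the function would run as root.
write_munge_override() {
    dir=$1 threads=$2
    mkdir -p "$dir" || return 1
    # The empty ExecStart= clears the packaged command before setting a new one.
    cat > "$dir/override.conf" <<EOF
[Service]
ExecStart=
ExecStart=/usr/sbin/munged --num-threads $threads
EOF
}
```

For example, write_munge_override /etc/systemd/system/munge.service.d 10, followed by the same systemctl daemon-reload and systemctl restart munge as above.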