Slurm Installation Guide

Revision as of 06:40, 4 November 2022

This page shows how to set up and configure a Slurm queueing system. Useful link: https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/

Pre-installation

Create global user account

The Slurm and MUNGE users must have consistent UIDs/GIDs across all nodes in the cluster, so create these global user accounts before installing the RPMs. This can be done via LDAPAdmin or whichever service you use to manage users. If you don't have access to such a service, please contact your system administrators.
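
One way to confirm the UID/GID requirement is to collect the IDs from every node and compare them. The following is a minimal sketch, not part of the original procedure: the function name ids_consistent and the hostnames node01 and node02 are hypothetical, and it assumes the account is named munge and that you can ssh to each node.

```shell
# Reads lines of "host uid gid" on stdin.
# Succeeds only if every host reports the same uid and gid as the first line.
ids_consistent() {
    awk 'NR == 1 { uid = $2; gid = $3 }
         $2 != uid || $3 != gid { bad = 1; print "mismatch on " $1 > "/dev/stderr" }
         END { exit bad + 0 }'
}

# Hypothetical usage (node01 and node02 are placeholder hostnames):
# for h in node01 node02; do
#     echo "$h $(ssh "$h" id -u munge) $(ssh "$h" id -g munge)"
# done | ids_consistent && echo "munge IDs are consistent"
```

The same check can be repeated for the slurm account by swapping the user name in the usage loop.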

Install the latest epel-release

CentOS8: dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
CentOS7: yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
RHEL7:   yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Master Node Setup

Install MUNGE

MUNGE is the authentication service that Slurm uses to validate users' credentials.

$ sudo yum install munge munge-libs munge-devel

(master node only) Create secret key

$ dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
$ chown munge:munge /etc/munge/munge.key
$ chmod 400 /etc/munge/munge.key
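
After creating the key, it can be worth confirming that the file came out as intended. The sketch below is an illustration only: check_munge_key is a hypothetical helper name, it relies on GNU stat (the default on CentOS/RHEL), and it checks only for the 1024-byte size and mode 400 produced by the commands above. It does not check ownership.

```shell
# Succeeds only if the key file has mode 400 and is exactly 1024 bytes,
# matching the dd and chmod commands above. Requires GNU stat (coreutils).
check_munge_key() (
    key=$1
    mode=$(stat -c '%a' "$key") || exit 1
    size=$(stat -c '%s' "$key") || exit 1
    [ "$mode" = "400" ] && [ "$size" -eq 1024 ]
)

# Hypothetical usage, run as root on the master node:
# check_munge_key /etc/munge/munge.key && echo "key looks OK"
```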

For worker nodes, copy munge.key from the master node with scp, then set the correct ownership and permissions on each worker.

$ scp -p /etc/munge/munge.key hostXXX:/etc/munge/munge.key
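
Every node must hold an identical key, so comparing checksums after the copy is a cheap sanity check. This is a minimal sketch under assumptions: key_digest is a hypothetical helper, sha256sum from coreutils is assumed to be available on all nodes, and hostXXX is the same placeholder used above.

```shell
# Prints the SHA-256 digest of a file (digest only, without the filename).
key_digest() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Hypothetical usage, run as root on the master node:
# local_sum=$(key_digest /etc/munge/munge.key)
# remote_sum=$(ssh hostXXX sha256sum /etc/munge/munge.key | cut -d ' ' -f 1)
# [ -n "$local_sum" ] && [ "$local_sum" = "$remote_sum" ] && echo "keys match"
```

Comparing digests avoids copying the secret key anywhere other than /etc/munge.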

Set ownership and permissions on the following directories

$ chown -R munge: /etc/munge/ /var/log/munge/
$ chmod 0700 /etc/munge/ /var/log/munge/

Start and enable MUNGE daemon at boot time

$ systemctl enable munge
$ systemctl start  munge

Increase the number of MUNGE threads on the master node (optional but recommended on busy servers)

$ cp /usr/lib/systemd/system/munge.service /etc/systemd/system/munge.service
$ vim /etc/systemd/system/munge.service
Change the ExecStart line to: ExecStart=/usr/sbin/munged --num-threads 10
Reload systemd and restart MUNGE
$ systemctl daemon-reload
$ systemctl restart munge
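
If you prefer not to edit the unit file by hand, the ExecStart change can be scripted. The following is a sketch only: set_munge_threads is a hypothetical helper, it assumes GNU sed, and it overwrites the whole ExecStart line with the form shown above, so any other options on that line would be lost.

```shell
# Rewrites the ExecStart line of a copied munge.service unit file so that
# munged starts with the given number of threads.
# Usage: set_munge_threads <unit-file> <threads>
set_munge_threads() (
    unit=$1
    threads=$2
    sed -i "s|^ExecStart=.*|ExecStart=/usr/sbin/munged --num-threads ${threads}|" "$unit"
)

# Hypothetical usage, run as root after the cp step above:
# set_munge_threads /etc/systemd/system/munge.service 10
# systemctl daemon-reload
# systemctl restart munge
```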