Slurm Installation Guide

From DISI

Revision as of 07:03, 4 November 2022

This page shows you how to set up and configure a Slurm queueing system. Useful link: https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/

Pre-installation

Create global user account

The Slurm and MUNGE users must have the same UID/GID on every node in the cluster. Create these global user accounts before installing the RPMs. This can be done via LDAPAdmin or whichever service you use to manage users. If you don't have access to such a service, please contact your system administrators.
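
One way to confirm that an account has the same IDs on two nodes is to compare the UID and GID fields of its passwd entry. The sketch below is illustrative only: the helper names `passwd_ids` and `same_ids` are hypothetical, and the host name in the usage comment is a placeholder.

```shell
# Print "UID:GID" from a passwd-format line (name:x:UID:GID:...).
passwd_ids() {
    printf '%s\n' "$1" | awk -F: '{ print $3 ":" $4 }'
}

# Succeed only if two passwd lines carry the same UID and GID.
same_ids() {
    [ "$(passwd_ids "$1")" = "$(passwd_ids "$2")" ]
}

# Usage on a real cluster (hostXXX is a placeholder):
#   master=$(getent passwd munge)
#   worker=$(ssh hostXXX getent passwd munge)
#   same_ids "$master" "$worker" || echo "munge UID/GID mismatch on hostXXX"
```

The same comparison can be repeated for the slurm account.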

Install the latest epel-release

CentOS8: dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
CentOS7: yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
RHEL7:   yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
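
The three URLs above differ only in the major release number, so a script can derive the right one. This is a small sketch; the function name `epel_release_url` is hypothetical, and the usage comment assumes the `%rhel` RPM macro is defined on the host, as it normally is on RHEL and CentOS.

```shell
# Build the epel-release URL for a given major release (7 or 8).
epel_release_url() {
    printf 'https://dl.fedoraproject.org/pub/epel/epel-release-latest-%s.noarch.rpm\n' "$1"
}

epel_release_url 7

# Usage on a real host (assumes the %rhel macro is available):
#   yum install "$(epel_release_url "$(rpm -E %rhel)")"
```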

Install MUNGE

MUNGE is an authentication service that Slurm uses to validate users' credentials.

$ sudo yum install munge munge-libs munge-devel

(master node only) Create secret key

$ dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
$ chown munge:munge /etc/munge/munge.key
$ chmod 400 /etc/munge/munge.key
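
Before distributing the key, it can be worth checking that it has the expected size and mode. The sketch below repeats the steps above against a scratch file rather than /etc/munge/munge.key, so it can be tried without root; the variable names are illustrative, and `stat -c` assumes GNU coreutils.

```shell
# Generate a 1024-byte key at a scratch path (not the real key location).
key=$(mktemp)
dd if=/dev/urandom of="$key" bs=1 count=1024 2>/dev/null
chmod 400 "$key"

# Check the size in bytes and the permission bits.
key_size=$(wc -c < "$key" | tr -d ' ')
key_mode=$(stat -c '%a' "$key")
echo "size=$key_size mode=$key_mode"
```

On the master node, the same two checks can be run against /etc/munge/munge.key.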

For worker nodes, copy munge.key from the master node with scp, then set the correct ownership and permissions on each worker.

$ scp -p /etc/munge/munge.key hostXXX:/etc/munge/munge.key
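
After copying, the key on each worker should be byte-for-byte identical to the one on the master. A minimal sketch for checking this by comparing SHA-256 digests follows; the helper name `same_digest` is hypothetical and hostXXX is a placeholder.

```shell
# Succeed only if two files have the same SHA-256 digest.
same_digest() {
    a=$(sha256sum < "$1" | awk '{ print $1 }')
    b=$(sha256sum < "$2" | awk '{ print $1 }')
    [ "$a" = "$b" ]
}

# Usage on a real cluster, run as root on the master (hostXXX is a placeholder):
#   local_sum=$(sha256sum < /etc/munge/munge.key | awk '{ print $1 }')
#   remote_sum=$(ssh hostXXX 'sha256sum < /etc/munge/munge.key' | awk '{ print $1 }')
#   [ "$local_sum" = "$remote_sum" ] || echo "munge.key differs on hostXXX"
```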

Set ownership and permissions on the following directories

$ chown -R munge: /etc/munge/ /var/log/munge/
$ chmod 0700 /etc/munge/ /var/log/munge/

Start and enable MUNGE daemon at boot time

$ systemctl enable munge
$ systemctl start  munge
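
Once the daemon is running, a credential round trip with `munge -n | unmunge` is a common way to confirm that it works. The wrapper below is only a sketch: the function name `munge_status` is hypothetical, and it reports "missing" when the MUNGE tools are not installed so that it can be run safely on any host.

```shell
# Report whether a locally encoded credential can be decoded again.
munge_status() {
    if ! command -v munge >/dev/null 2>&1 || ! command -v unmunge >/dev/null 2>&1; then
        echo "missing"
        return 0
    fi
    if munge -n 2>/dev/null | unmunge >/dev/null 2>&1; then
        echo "ok"
    else
        echo "failed"
    fi
}

munge_status

# To test a worker from the master (hostXXX is a placeholder):
#   munge -n | ssh hostXXX unmunge
```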

Increase the number of MUNGE threads on the master node (optional, but recommended on a busy server)

$ cp /usr/lib/systemd/system/munge.service /etc/systemd/system/munge.service
$ vim /etc/systemd/system/munge.service
Edit the ExecStart line so that it reads:
ExecStart=/usr/sbin/munged --num-threads 10
Then reload systemd and restart MUNGE:
$ systemctl daemon-reload
$ systemctl restart munge
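
If the edit needs to be scripted rather than made by hand in vim, a `sed` substitution can rewrite the ExecStart line. The sketch below works on a scratch file with made-up unit contents; the function name `set_munge_threads` is hypothetical, `sed -i` assumes GNU sed, and the substitution replaces the whole ExecStart line, so any options already on that line would be dropped.

```shell
# Rewrite the ExecStart line of a unit file to set the thread count.
set_munge_threads() {
    unit_file=$1
    threads=$2
    sed -i "s|^ExecStart=.*|ExecStart=/usr/sbin/munged --num-threads ${threads}|" "$unit_file"
}

# Demonstrate on a scratch copy with illustrative contents.
unit=$(mktemp)
printf '[Service]\nType=forking\nExecStart=/usr/sbin/munged\n' > "$unit"
set_munge_threads "$unit" 10
grep '^ExecStart=' "$unit"
```

On the master node the target would be /etc/systemd/system/munge.service, followed by the daemon-reload and restart shown above.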

Install Slurm

Although Slurm is available in EPEL, it is better to build the RPMs ourselves to ensure we have the latest release.

This guide also shows you how to set up Slurm with accounting (slurmdbd using MySQL as the database). Accounting is optional and can be skipped, but it is useful for keeping records of jobs and managing resources.

Install prerequisite packages

$ yum install rpm-build gcc python3 openssl openssl-devel pam-devel numactl numactl-devel hwloc hwloc-devel munge munge-libs munge-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel gtk2-devel libibmad libibumad perl-Switch perl-ExtUtils-MakeMaker xorg-x11-xauth http-parser-devel json-c-devel

If you are setting up slurmdbd, you will also need:

$ yum install mariadb-server mariadb-devel


$ wget https://download.schedmd.com/slurm/slurm-22.05.5.tar.bz2
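
To make later upgrades easier, the version can be kept in a variable and the tarball name and URL derived from it. This is a sketch: the variable names are illustrative, and the commented `rpmbuild` lines are an assumption about the usual next step, not something this page states.

```shell
# Derive the tarball name and download URL from the Slurm version.
SLURM_VERSION=22.05.5
tarball="slurm-${SLURM_VERSION}.tar.bz2"
url="https://download.schedmd.com/slurm/${tarball}"
echo "$url"

# On the build host:
#   wget "$url"
# Assumed next step (not covered above): build RPMs straight from the tarball.
#   rpmbuild -ta "$tarball"                # without accounting
#   rpmbuild -ta "$tarball" --with mysql   # with slurmdbd/MariaDB support
```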