<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://wiki.docking.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dudenko</id>
	<title>DISI - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="http://wiki.docking.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dudenko"/>
	<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Special:Contributions/Dudenko"/>
	<updated>2026-05-23T12:25:30Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.1</generator>
	<entry>
		<id>http://wiki.docking.org/index.php?title=ZINC22:Downloading&amp;diff=13126</id>
		<title>ZINC22:Downloading</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=ZINC22:Downloading&amp;diff=13126"/>
		<updated>2020-12-17T16:11:16Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here is how to download ZINC22. &lt;br /&gt;
&lt;br /&gt;
* wget, curl&lt;br /&gt;
&lt;br /&gt;
* rsync&lt;br /&gt;
&lt;br /&gt;
 An example of keeping the H06 tranche up to date:&lt;br /&gt;
 rsync -L  -a --progress --prune-empty-dirs --delete-excluded --include=&amp;quot;*/&amp;quot; --include=&amp;quot;H06P???-?.smi.gz&amp;quot; --exclude=&amp;quot;*&amp;quot;  rsync://files.docking.org/ZINC22-3D/H06 .&lt;br /&gt;
&lt;br /&gt;
* Globus&lt;br /&gt;
&lt;br /&gt;
* AWS&lt;br /&gt;
&lt;br /&gt;
* Wynton&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:ZINC22]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12934</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12934"/>
		<updated>2020-09-17T17:55:25Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We hope to migrate fully from the outdated SGE system to Slurm, so any error report is helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
New Slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful Slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources the cluster offers, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039; (to install the latest Slurm version, 20.02.4, see the &amp;quot;Migrating to gimel5&amp;quot; section below)&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon that authenticates communication between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: adduser slurm&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID and GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* figure out what CPU and memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start Slurm on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, open the firewall ports so the master node and compute node n-1-17 can communicate:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Node down after reboot&#039;&#039;&#039;&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Migrating to gimel5 ====&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpm-build gcc openssl openssl-devel libssh2-devel pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel gtk2-devel libibmad libibumad perl-Switch perl-ExtUtils-MakeMaker mysql-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Installation on a compute node:&amp;lt;br&amp;gt;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
* &#039;&#039;systemctl enable slurmd&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
* &#039;&#039;systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
    - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12924</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12924"/>
		<updated>2020-09-11T16:10:17Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We hope to migrate fully from the outdated SGE system to Slurm, so any error report is helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
New Slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful Slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources the cluster offers, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039; (to install the latest Slurm version, 20.02.4, see the &amp;quot;Migrating to gimel5&amp;quot; section below)&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon that authenticates communication between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: adduser slurm&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID and GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* figure out what CPU and memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start Slurm on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, open the firewall ports so the master node and compute node n-1-17 can communicate:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Node down after reboot&#039;&#039;&#039;&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Migrating to gimel5 ====&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel mysql-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Installation on a compute node:&amp;lt;br&amp;gt;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
* &#039;&#039;systemctl enable slurmd&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
* &#039;&#039;systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
    - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12923</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12923"/>
		<updated>2020-09-11T12:39:00Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We hope to migrate fully from the outdated SGE system to Slurm, so any error report is helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
New Slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong.&amp;lt;br&amp;gt;&lt;br /&gt;
Finally, you may want to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information about a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If your Slurm job was submitted correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
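A listing like the one above can be summarised per job state. The following is only a sketch: count_by_state is a hypothetical helper (not part of Slurm or DOCK) that reads squeue output on stdin and assumes the default column layout shown above, in which the state (ST) is the fifth column.&lt;br /&gt;

```shell
# Hypothetical helper: count squeue output lines per job state.
# Assumes the default squeue layout, where ST is the fifth column.
count_by_state() {
    awk '
        $1 == "JOBID" { next }    # skip the header line
        NF == 0       { next }    # skip blank lines
        { n[$5]++ }               # tally by state code (R, PD, ...)
        END { for (s in n) print s, n[s] }
    ' | sort
}
```

To apply it to the live queue, pipe squeue into the function, e.g. squeue -u $USER | count_by_state. A pending array range such as 4187_[637-2091] appears as one line and is therefore counted once.&lt;br /&gt;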
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039; (to install the latest Slurm version, 20.02.4, see the &amp;quot;Migrating to gimel5&amp;quot; section below)&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel to /etc/munge on the local node. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
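&lt;br /&gt;
The key permissions set above can be double-checked with a small script. This is a minimal sketch, assuming GNU coreutils stat (as shipped with CentOS); check_key_mode is a hypothetical helper name, not a munge or Slurm command.&lt;br /&gt;

```shell
# Hypothetical helper: report whether a key file has mode 400.
# Assumes GNU stat, whose -c %a option prints the octal permission bits.
check_key_mode() {
    mode=$(stat -c %a "$1") || return 1
    if [ "$mode" = "400" ]; then
        echo "ok"
    else
        echo "unexpected mode: $mode"
        return 1
    fi
}
```

On a node it would be called as check_key_mode /etc/munge/munge.key; it checks only the mode, not the munge:munge ownership.&lt;br /&gt;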
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, you need to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel to /etc/slurm on the local node.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
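The NodeName line from the list above can be assembled by a small script rather than typed by hand. This is only a sketch: make_node_line is a hypothetical helper, and it falls back to the CPU count reported by nproc when none is given, which may differ from the value you want Slurm to use.&lt;br /&gt;

```shell
# Hypothetical helper: print a slurm.conf NodeName line.
# Arguments: node name, node address, optional CPU count
# (defaults to the output of nproc on the current machine).
make_node_line() {
    name=$1
    addr=$2
    cpus=${3:-$(nproc)}
    printf 'NodeName=%s NodeAddr=%s CPUs=%s State=UNKNOWN\n' "$name" "$addr" "$cpus"
}
```

For the node described here the call would be make_node_line n-1-17 10.20.1.17 24. Running slurmd -C on the node also prints the hardware configuration as Slurm detects it, which can be compared with the generated line.&lt;br /&gt;
&lt;br /&gt;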
Finally, open the firewall ports so that the master node and compute node n-1-17 can communicate:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Node down after reboot&#039;&#039;&#039;&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Migrating to gimel5 ====&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel mysql-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mariadb; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Installation on a compute node:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;systemctl enable slurmd&#039;&#039;&lt;br /&gt;
* &#039;&#039;systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
    - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12922</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12922"/>
		<updated>2020-09-11T12:38:19Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is the Slurm version of qsub-mr, used to submit jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We hope to migrate fully from the outdated SGE system to Slurm, so any error report is helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
The new Slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own directory&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has already been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong.&amp;lt;br&amp;gt;&lt;br /&gt;
Finally, you may want to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information about a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If your Slurm job was submitted correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039; (to install the latest Slurm version, 20.02.4, see the &amp;quot;Migrating to gimel5&amp;quot; section below)&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel to /etc/munge on the local node. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, you need to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel to /etc/slurm on the local node.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, open the firewall ports so that the master node and compute node n-1-17 can communicate:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Node down after reboot&#039;&#039;&#039;&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Migrating to gimel5 ====&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel mysql-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mariadb; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Installation on a compute node:&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
systemctl enable slurmd&amp;lt;br&amp;gt;&lt;br /&gt;
systemctl start slurmd&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
    - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12921</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12921"/>
		<updated>2020-09-11T12:35:18Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is the Slurm version of qsub-mr, used to submit jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We hope to migrate fully from the outdated SGE system to Slurm, so any error report is helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
The new Slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own directory&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has already been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong.&amp;lt;br&amp;gt;&lt;br /&gt;
Finally, you may want to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information about a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If your Slurm job was submitted correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039; (to install the latest Slurm version, 20.02.4, see the &amp;quot;Migrating to gimel5&amp;quot; section below)&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is the daemon that Slurm uses to authenticate communication between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID and GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found at gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart the Slurm master node at gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the computing nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
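The node line appended to slurm.conf above can be drafted from the CPU count detected on the node itself. This is only a minimal sketch: node_line is a hypothetical helper (not part of Slurm), the CPU count comes from nproc, and NodeAddr is left out because it has to be filled in by hand.&lt;br /&gt;

```shell
#!/bin/sh
# Print a slurm.conf node line for a given node name and CPU count.
# node_line is a hypothetical helper for illustration, not a Slurm tool.
node_line() {
    printf 'NodeName=%s CPUs=%s State=UNKNOWN\n' "$1" "$2"
}

# Draft the line for the local machine: node name from uname, CPU count from nproc.
# Add NodeAddr by hand before appending the line to slurm.conf.
node_line "$(uname -n)" "$(nproc)"
```

If slurmd is already installed on the node, ''slurmd -C'' should report the detected hardware in a similar format; compare its output with the line you append.&lt;br /&gt;
&lt;br /&gt;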
Finally, open the firewall ports used for communication between the master node and computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Node down after reboot&#039;&#039;&#039;&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Migrating to gimel5 ====&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel mysql-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mariadb; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Installation on a compute node:&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
     - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12920</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12920"/>
		<updated>2020-09-11T12:31:50Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We are hoping to migrate fully from the outdated SGE system to Slurm, so any error report would be helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally at gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039; (to install the latest Slurm version, 20.02.4, see the &amp;quot;Migrating to gimel5&amp;quot; section below)&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is the daemon that Slurm uses to authenticate communication between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID and GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found at gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart the Slurm master node at gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the computing nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, open the firewall ports used for communication between the master node and computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Node down after reboot&#039;&#039;&#039;&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Migrating to gimel5 ====&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel mysql-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mariadb; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
On a compute node&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
     - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12919</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12919"/>
		<updated>2020-09-11T12:28:53Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We are hoping to migrate fully from the outdated SGE system to Slurm, so any error report would be helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally at gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039; (to install the latest Slurm version, 20.02.4, see the &amp;quot;Migrating to gimel5&amp;quot; section below)&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is the daemon that Slurm uses to authenticate communication between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID and GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found at gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart the Slurm master node at gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the computing nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, open the firewall ports used for communication between the master node and computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Migrating to gimel5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mariadb; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
On a compute node&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
     - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12918</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12918"/>
		<updated>2020-09-11T12:26:31Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
It is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: this has not been extensively tested yet. Please contact me if the script does not work. We are hoping to migrate fully from the outdated SGE system to Slurm. Any error report would be helpful - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download anaconda and install into his/her own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
simple installation via &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If your Slurm job is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039; (To install the latest Slurm version (20.02.4), see below.)&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/Memory resources you have at n-1-17 (see /proc/cpuinfo) and append the following line:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restarting slurm master node at gimel (Centos 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enabling and starting slurm computing nodes (Centos 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
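&lt;br /&gt;
The requirement in the list above that the slurm user has the same numeric IDs on every node can be checked before starting slurmd. This is a minimal sketch and not part of the original procedure: the helper names passwd_ids and same_ids are hypothetical, and it assumes a copy of the /etc/passwd file from gimel has been saved locally (the path /tmp/passwd.gimel in the usage comment is only an example).&lt;br /&gt;

```shell
# Hypothetical helper: print "UID:GID" for a user from a passwd-format file.
# Arguments: passwd-format file, user name.
passwd_ids() {
    awk -F: -v u="$2" '$1 == u { print $3 ":" $4 }' "$1"
}

# Hypothetical helper: succeed only if both files give the user the same IDs.
# Arguments: first file, second file, user name.
same_ids() {
    a=$(passwd_ids "$1" "$3")
    b=$(passwd_ids "$2" "$3")
    # Fail if the user is missing from the first file.
    [ -n "$a" ] || return 1
    [ "$a" = "$b" ]
}

# Usage on a compute node (not run here; /tmp/passwd.gimel is an assumed path):
#   same_ids /etc/passwd /tmp/passwd.gimel slurm || echo "slurm UID/GID mismatch"
```

The check only compares the passwd entries; /etc/group still has to be reviewed separately.&lt;br /&gt;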
&lt;br /&gt;
Last but not least, configure the firewall to allow communication between the master node and compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Migrating to gimel5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mariadb; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
On a compute node&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
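&lt;br /&gt;
The package paths above follow one pattern, so they can be generated from the version number instead of being typed by hand. This is only a sketch: the function name slurm_rpm_paths is hypothetical, and it assumes the rpmbuild output layout and the -1.el7.x86_64 suffix seen in the commands above.&lt;br /&gt;

```shell
# Hypothetical helper: print the rpm path for each package name.
# Arguments: Slurm version, then one or more package names.
slurm_rpm_paths() {
    ver=$1
    shift
    for pkg in "$@"; do
        # Assumes the el7/x86_64 naming used by the rpmbuild step above.
        printf 'rpmbuild/RPMS/x86_64/%s-%s-1.el7.x86_64.rpm\n' "$pkg" "$ver"
    done
}

# Usage on a compute node (not run here):
#   yum install $(slurm_rpm_paths 20.02.4 slurm slurm-slurmd)
```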
&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
     - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12917</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12917"/>
		<updated>2020-09-11T12:25:12Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
It is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: this has not been extensively tested yet. Please contact me if the script does not work. We are hoping to migrate fully from the outdated SGE system to Slurm. Any error report would be helpful - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download anaconda and install into his/her own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
simple installation via &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If your Slurm job is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
p.s. To install the latest Slurm version (20.02.4), see below.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/Memory resources you have at n-1-17 (see /proc/cpuinfo) and append the following line:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restarting slurm master node at gimel (Centos 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enabling and starting slurm computing nodes (Centos 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, configure the firewall to allow communication between the master node and compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Migrating to gimel5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mariadb; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
On a compute node&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
     - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12916</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12916"/>
		<updated>2020-09-11T12:21:52Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
It is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: this has not been extensively tested yet. Please contact me if the script does not work. We are hoping to migrate fully from the outdated SGE system to Slurm. Any error report would be helpful - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download anaconda and install into his/her own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
simple installation via &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
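The ST column in the listing above distinguishes running (R) from pending (PD) entries. As a rough sketch (not part of DOCK or Slurm), the hypothetical helper below tallies entries per state from default-format &#039;&#039;squeue&#039;&#039; output read on standard input. It assumes the state is the fifth whitespace-separated field, which holds only when job names contain no spaces, and it counts a collapsed pending array such as 4187_[637-2091] as a single entry.&lt;br /&gt;

```shell
# count_job_states: hypothetical helper, not shipped with Slurm or DOCK.
# Reads default-format squeue output on stdin, skips the header line,
# and prints one "STATE COUNT" line per job state, sorted by state.
count_job_states() {
    awk 'NR != 1 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Example on three lines copied from the listing above.
printf '%s\n' \
    '             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)' \
    '   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)' \
    '          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20' |
    count_job_states
```

On the cluster, a typical call would be &#039;&#039;squeue -u $USER | count_job_states&#039;&#039;.&lt;br /&gt;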
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
P.S. To install the latest Slurm version (20.02.4), see below.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
MUNGE is the daemon Slurm uses to authenticate communication between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restarting slurm master node at gimel (Centos 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enabling and starting slurm computing nodes (Centos 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
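&lt;br /&gt;
The node line appended to slurm.conf in the steps above can be generated rather than typed by hand. This is only a sketch: &#039;&#039;node_conf_line&#039;&#039; is a hypothetical helper, and it emits just the four fields used in the n-1-17 example. Check the result against the actual hardware before appending it to /etc/slurm/slurm.conf.&lt;br /&gt;

```shell
# node_conf_line: hypothetical helper, not part of Slurm.
# Prints a slurm.conf node line from a node name, an address and a CPU count,
# using the same four fields as the n-1-17 example.
node_conf_line() {
    printf 'NodeName=%s NodeAddr=%s CPUs=%s State=UNKNOWN\n' "$1" "$2" "$3"
}

# The n-1-17 entry from the text, with the CPU count passed explicitly.
node_conf_line n-1-17 10.20.1.17 24
```

On the node itself, &#039;&#039;nproc&#039;&#039; (GNU coreutils) can supply the CPU count, e.g. &#039;&#039;node_conf_line n-1-17 10.20.1.17 &amp;quot;$(nproc)&amp;quot;&#039;&#039;. Running &#039;&#039;slurmd -C&#039;&#039; on the node also prints the hardware configuration that slurmd detects, which is useful for cross-checking.&lt;br /&gt;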
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output should look like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Migrating to gimel5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-20.02.4.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=20.02.4; rpmbuild -ta slurm-$VER.tar.bz2 --with mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
On a compute node&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
     - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12891</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12891"/>
		<updated>2020-08-31T22:40:23Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We are hoping to fully migrate to Slurm from the outdated SGE system. Any error reports would be helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
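&lt;br /&gt;
For a script with several qsub-mr calls, the replacement described above can be done with sed. A minimal sketch, assuming GNU or BSD sed (both accept a backup suffix attached to -i); &#039;&#039;use_sbatch_mr&#039;&#039; is a hypothetical helper name, and the sketch assumes the rest of each command line needs no change:&lt;br /&gt;

```shell
# use_sbatch_mr: hypothetical helper, not part of the sbatch-slice tools.
# Rewrites every occurrence of the qsub-mr path in the given script to the
# sbatch-mr path, in place, keeping the original as FILE.bak.
use_sbatch_mr() {
    sed -i.bak \
        's#/nfs/soft/tools/utils/qsub-slice/qsub-mr#/nfs/soft/tools/utils/sbatch-slice/sbatch-mr#g' \
        "$1"
}

# Usage (my_script.sh is a placeholder name):
# use_sbatch_mr my_script.sh
```

The untouched original remains available as my_script.sh.bak if the migrated script misbehaves.&lt;br /&gt;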
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder.&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we plan to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has been run with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&amp;lt;br&amp;gt;&lt;br /&gt;
You may then want to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
MUNGE is the daemon Slurm uses to authenticate communication between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restarting slurm master node at gimel (Centos 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enabling and starting slurm computing nodes (Centos 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output should look like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Migrating to gimel5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On a compute node&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&amp;lt;br&amp;gt;&lt;br /&gt;
yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
&lt;br /&gt;
     - 32-core:&lt;br /&gt;
                + n-9-34 (GTX 1080 Ti)&lt;br /&gt;
                + n-9-36 (GTX 1080 Ti)&lt;br /&gt;
                + n-1-126 (GTX 980)&lt;br /&gt;
                + n-1-141 (GTX 980)&lt;br /&gt;
     - 40-core:&lt;br /&gt;
                + n-1-28 (RTX 2080 Super)&lt;br /&gt;
                + n-1-38 (RTX 2080 Super)&lt;br /&gt;
                + n-1-101 (RTX 2080 Super)&lt;br /&gt;
                + n-1-105 (RTX 2080 Super)&lt;br /&gt;
                + n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12890</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12890"/>
		<updated>2020-08-31T22:39:30Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
This is a Slurm version of qsub-mr for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We are hoping to fully migrate to Slurm from the outdated SGE system. Any error reports would be helpful. - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder.&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we plan to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has been run with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&amp;lt;br&amp;gt;&lt;br /&gt;
You may then want to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
MUNGE is the daemon Slurm uses to authenticate communication between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute node (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
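&lt;br /&gt;
The node line for slurm.conf can be assembled from the node name, its address and its CPU count. A minimal sketch, assuming a POSIX shell; the helper names count_cpus and node_line are hypothetical and not part of Slurm:&lt;br /&gt;

```shell
# Hypothetical helper (not part of Slurm): count the logical CPUs listed in a
# cpuinfo-style file; defaults to /proc/cpuinfo.
count_cpus() {
    grep -c "^processor" "${1:-/proc/cpuinfo}"
}

# Hypothetical helper: print a slurm.conf node line.
# $1 = node name, $2 = node address, $3 = CPU count
node_line() {
    printf "NodeName=%s NodeAddr=%s CPUs=%s State=UNKNOWN\n" "$1" "$2" "$3"
}

# Example: reproduce the line used for n-1-17 above.
node_line n-1-17 10.20.1.17 24
```

On the node itself, the third argument could be filled in with the output of count_cpus instead of a hard-coded number.&lt;br /&gt;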
&lt;br /&gt;
Finally, open the firewall ports needed for communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039; and you will see:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
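The listing above can also be filtered in a script, for instance to find every node in a given state. A minimal sketch, assuming the column layout shown above (STATE is the fourth column); nodes_in_state is a hypothetical helper name:&lt;br /&gt;

```shell
# Hypothetical helper: read "sinfo -lNe" output on stdin and print the names
# of nodes whose STATE column (4th field) equals the requested state.
# $1 = state to match, e.g. drained or idle
nodes_in_state() {
    awk -v want="$1" '$1 != "NODELIST" && $4 == want { print $1 }'
}

# Example with the header and two rows shortened from the listing above.
printf "%s\n" \
    "NODELIST   NODES PARTITION       STATE CPUS" \
    "gimel          1    gimel*     drained   24" \
    "n-1-17         1    gimel*        idle   24" | nodes_in_state drained
```

On the cluster, piping the real sinfo -lNe output into nodes_in_state drained would list the drained nodes.&lt;br /&gt;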
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Migrating to gimel5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On a compute node&lt;br /&gt;
 yum install rpmbuild/RPMS/x86_64/slurm-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
 yum install rpmbuild/RPMS/x86_64/slurm-slurmd-20.02.4-1.el7.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPUs specification ====&lt;br /&gt;
* 32-core:&lt;br /&gt;
** n-9-34 (GTX 1080 Ti)&lt;br /&gt;
** n-9-36 (GTX 1080 Ti)&lt;br /&gt;
** n-1-126 (GTX 980)&lt;br /&gt;
** n-1-141 (GTX 980)&lt;br /&gt;
* 40-core:&lt;br /&gt;
** n-1-28 (RTX 2080 Super)&lt;br /&gt;
** n-1-38 (RTX 2080 Super)&lt;br /&gt;
** n-1-101 (RTX 2080 Super)&lt;br /&gt;
** n-1-105 (RTX 2080 Super)&lt;br /&gt;
** n-1-124 (RTX 2080 Super)&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12886</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12886"/>
		<updated>2020-08-31T17:48:34Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
sbatch-mr is the Slurm version of qsub-mr, used for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We are hoping to fully migrate to Slurm from the outdated SGE system. Any error report would be helpful - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is working correctly, typing &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
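For a large job array, a per-state count is often easier to read than the full listing. A minimal sketch, assuming the default squeue column layout shown above (ST is the fifth column) and job names without spaces; count_job_states is a hypothetical helper name:&lt;br /&gt;

```shell
# Hypothetical helper: read squeue output on stdin and print how many rows
# are in each job state (ST column, 5th field).
# Note: a pending array such as 4187_[637-2091] is a single row, so it is
# counted once, not once per task.
count_job_states() {
    awk '$1 != "JOBID" && NF >= 5 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Example with the header and three rows copied from the listing above.
printf "%s\n" \
    "             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)" \
    "   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)" \
    "          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20" \
    "          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34" | count_job_states
```

On the cluster, the same helper could read the real squeue output through a pipe.&lt;br /&gt;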
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
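&lt;br /&gt;
The three install commands differ only in the package name, so they can be generated from the version number. A minimal dry-run sketch that only prints the commands; rpm_path is a hypothetical helper, and the -1.el7.x86_64 suffix is assumed from the file names listed above:&lt;br /&gt;

```shell
# Hypothetical helper: build the path of an RPM produced by rpmbuild.
# $1 = package name, $2 = Slurm version
rpm_path() {
    printf "rpmbuild/RPMS/x86_64/%s-%s-1.el7.x86_64.rpm\n" "$1" "$2"
}

VER=17.02.11
# Dry run: print the install commands in the order used above.
for pkg in slurm-plugins slurm slurm-munge; do
    echo "yum install $(rpm_path "$pkg" "$VER")"
done
```

Printing the commands first makes it easy to check the file names against the contents of rpmbuild/RPMS/x86_64 before installing anything.&lt;br /&gt;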
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* figure out what CPU/memory resources you have on n-1-17 (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute node (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, open the firewall ports needed for communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039; and you will see:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Migrating to gimel5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12884</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12884"/>
		<updated>2020-08-26T22:05:10Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
=== Submit Jobs with Slurm ===&lt;br /&gt;
&lt;br /&gt;
==== SBATCH-MR (beta) ====&lt;br /&gt;
sbatch-mr is the Slurm version of qsub-mr, used for submitting jobs to the Slurm queueing system. Note: it has not been extensively tested yet. Please contact me if the script does not work. We are hoping to fully migrate to Slurm from the outdated SGE system. Any error report would be helpful - Khanh&lt;br /&gt;
&lt;br /&gt;
New slurm scripts are located in /nfs/soft/tools/utils/sbatch-slice&lt;br /&gt;
 Simply replace /nfs/soft/tools/utils/qsub-slice/qsub-mr with /nfs/soft/tools/utils/sbatch-slice/sbatch-mr in your script&lt;br /&gt;
&lt;br /&gt;
To check the status of your job:&lt;br /&gt;
 By username&lt;br /&gt;
  $ squeue -u &amp;lt;username&amp;gt;&lt;br /&gt;
 By jobid&lt;br /&gt;
  $ squeue -j &amp;lt;job_id&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Submit load2d Jobs ====&lt;br /&gt;
&lt;br /&gt;
 $ cd &amp;lt;catalog_shortname&amp;gt;&lt;br /&gt;
 $ source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh&lt;br /&gt;
 (development) $ sh /nfs/exa/work/khtang/submit_scripts/sbatch_slice/batch_zinc21.slurm &amp;lt;catalog_shortname&amp;gt;.ism&lt;br /&gt;
&lt;br /&gt;
==== Submit DOCK Jobs ====&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is working correctly, typing &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Slurm Installation Guide ===&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Node master ====&lt;br /&gt;
&#039;&#039;&#039;TBA&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Setup Compute Nodes ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel openssl-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* figure out what CPU/memory resources you have on n-1-17 (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute node (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, open the firewall ports needed for communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6817/tcp&#039;&#039;   #slurmctld&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;   #slurmd&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039; and you will see:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==== Node down after reboot ====&lt;br /&gt;
On gimel (master node)&lt;br /&gt;
 sudo scontrol update NodeName=&amp;lt;node_name&amp;gt; State=RESUME&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;br /&gt;
&lt;br /&gt;
[[Category : Slurm]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12671</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12671"/>
		<updated>2020-06-03T20:51:29Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install AWS ParallelCluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section.&amp;lt;br&amp;gt;&lt;br /&gt;
If you do not have an access key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (the keys will be stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
Feel free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
If you do not have an EC2 key pair yet, create one under Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is ready and stored in ~/.parallelcluster/config&amp;lt;br&amp;gt;&lt;br /&gt;
You may review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run &#039;&#039;pcluster create -c ~/.parallelcluster/config UCSFbeta&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
In EC2-&amp;gt;Instances-&amp;gt;Instances you will see your master node waiting for jobs.&amp;lt;br&amp;gt;&lt;br /&gt;
You now have two launch templates: one for the master node and one for the compute nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
You can modify them here: EC2-&amp;gt;Instances-&amp;gt;Launch Templates&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In EC2-&amp;gt;Auto Scaling-&amp;gt;Auto Scaling Groups you can modify the parameters that shape your cluster,&amp;lt;br&amp;gt;&lt;br /&gt;
e.g., min size, max size, desired size, default cooldown (when to start terminating idle compute nodes), etc.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To connect to your master node via SSH, run a command similar to &#039;&#039;ssh -i &amp;quot;YOUR_PRIVATE_KEY.pem&amp;quot; centos@ec2-54-89-150-98.compute-1.amazonaws.com&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
As &#039;&#039;sinfo -lNe&#039;&#039; shows, no compute nodes are available at first (cost-saving mode).&amp;lt;br&amp;gt;&lt;br /&gt;
To bring the compute nodes up, it is enough to submit even a trivial job: &#039;&#039;srun -n4 hostname&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Response: &#039;&#039;srun: Required node not available (down, drained or reserved)&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What happens next: within 1 minute, jobwatcher notices that there are jobs in the queue.&amp;lt;br&amp;gt;&lt;br /&gt;
The system brings up extra compute nodes (up to the max size parameter) and the queued jobs start running.&amp;lt;br&amp;gt;&lt;br /&gt;
When compute nodes become idle, the system terminates them (only those that have been idle longer than the &amp;quot;cooldown time&amp;quot;).&amp;lt;br&amp;gt;&lt;br /&gt;
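&lt;br /&gt;
The scale-up rule described above can be summarised with a small toy model. This is only a sketch under stated assumptions, not the logic ParallelCluster actually runs: the function name desired_compute_nodes and its parameters are hypothetical, the defaults mirror the min/max sizes from the configuration prompts, and the cooldown timer is not modelled.&lt;br /&gt;

```python
def desired_compute_nodes(pending_slots, busy_nodes=0, cpus_per_node=1,
                          min_size=0, max_size=10):
    """Toy model (an assumption, not ParallelCluster code): enough nodes
    for the queued work, clamped to the [min_size, max_size] range."""
    # Ceiling division: number of nodes needed to host the pending job slots.
    extra_nodes = -(-pending_slots // cpus_per_node)
    wanted = busy_nodes + extra_nodes
    return max(min_size, min(max_size, wanted))


# srun -n4 hostname on an empty cluster of 1-CPU nodes asks for 4 nodes.
print(desired_compute_nodes(pending_slots=4))
# A larger request is capped at the maximum cluster size.
print(desired_compute_nodes(pending_slots=25))
```

With the defaults above, the two calls print 4 and 10: the second request is limited by the maximum cluster size.&lt;br /&gt;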
&lt;br /&gt;
&lt;br /&gt;
Once the nodes are up, you will see them in the list of available resources: &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Tue Jun  2 14:20:55 2020&lt;br /&gt;
  NODELIST          NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
  ip-172-31-16-148      1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-18-22       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-22-52       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-29-18       1  compute*        idle    1    1:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
Running a job on 4 CPUs:&lt;br /&gt;
&lt;br /&gt;
  [centos@ip-172-31-0-25 ~]$ srun -n4 hostname&lt;br /&gt;
  ip-172-31-25-53&lt;br /&gt;
  ip-172-31-18-252&lt;br /&gt;
  ip-172-31-23-39&lt;br /&gt;
  ip-172-31-16-128&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://github.com/aws-samples/aws-plugin-for-slurm&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12670</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12670"/>
		<updated>2020-06-03T20:48:29Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we plan to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has already been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with your setup.&amp;lt;br&amp;gt;&lt;br /&gt;
Finally, compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful Slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
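&lt;br /&gt;
A listing like the one above can also be summarised programmatically. The sketch below is a minimal example, assuming the default squeue column layout shown here; the names tasks_by_state, SAMPLE and RANGE are hypothetical, and only simple array ranges of the form START-END are expanded (other array notations are counted as a single task).&lt;br /&gt;

```python
import re
from collections import Counter

# A few rows copied from the squeue listing above.
SAMPLE = """\
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34
"""

# Matches a collapsed array entry such as 4187_[637-2091].
RANGE = re.compile(r"_\[(\d+)-(\d+)\]$")


def tasks_by_state(squeue_text):
    """Count array tasks per job state (ST column) in squeue-style output."""
    rows = [line.split() for line in squeue_text.splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    job_col = header.index("JOBID")
    state_col = header.index("ST")
    counts = Counter()
    for fields in body:
        match = RANGE.search(fields[job_col])
        # A collapsed range stands for many pending tasks; count all of them.
        n_tasks = int(match.group(2)) - int(match.group(1)) + 1 if match else 1
        counts[fields[state_col]] += n_tasks
    return dict(counts)


print(tasks_by_state(SAMPLE))
```

For the sample rows this reports 1455 pending (PD) tasks and 2 running (R) tasks.&lt;br /&gt;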
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the packages built by rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is the authentication daemon that Slurm uses to validate messages exchanged between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
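&lt;br /&gt;
The two slurm.conf edits from the list above (the NodeName line and the extended partition) can be generated with a short script. This is a minimal sketch, not part of Slurm: the helper names node_line and add_node_to_partition are hypothetical, and the partition helper assumes a plain comma-separated Nodes= list without bracketed host ranges.&lt;br /&gt;

```python
import os


def node_line(name, addr, cpus=None):
    """Build a slurm.conf NodeName line; cpus defaults to the local CPU count."""
    if cpus is None:
        cpus = os.cpu_count()
    return f"NodeName={name} NodeAddr={addr} CPUs={cpus} State=UNKNOWN"


def add_node_to_partition(partition_line, node):
    """Return partition_line with node appended to its Nodes= list if absent."""
    parts = partition_line.split()
    for i, part in enumerate(parts):
        if part.startswith("Nodes="):
            nodes = part[len("Nodes="):].split(",")
            if node not in nodes:
                nodes.append(node)
            parts[i] = "Nodes=" + ",".join(nodes)
    return " ".join(parts)


old_partition = ("PartitionName=gimel Nodes=gimel,n-5-34,n-5-35 "
                 "Default=YES MaxTime=INFINITE State=UP")
print(node_line("n-1-17", "10.20.1.17", cpus=24))
print(add_node_to_partition(old_partition, "n-1-17"))
```

The printed lines match the NodeName and PartitionName entries shown in the list; review them before pasting into slurm.conf on every node.&lt;br /&gt;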
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039; and you will see:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
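&lt;br /&gt;
To get a quick total of CPUs per node state from such a listing, the output can be parsed by column name. The sketch below assumes the column layout shown above; cpus_by_state and SAMPLE are hypothetical names, and state suffixes that Slurm may append (for example an asterisk) are kept as part of the state string.&lt;br /&gt;

```python
from collections import defaultdict

# Header and node rows copied from the sinfo -lNe listing above.
SAMPLE = """\
Wed May 27 09:49:54 2020
NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none
n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none
n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none
n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none
"""


def cpus_by_state(sinfo_text):
    """Sum the CPUS column per node STATE in sinfo -lNe style output."""
    lines = [line for line in sinfo_text.splitlines() if line.strip()]
    # Find the header row; anything before it (the timestamp) is skipped.
    header_index = next(i for i, line in enumerate(lines)
                        if line.split()[0] == "NODELIST")
    columns = lines[header_index].split()
    state_col = columns.index("STATE")
    cpus_col = columns.index("CPUS")
    totals = defaultdict(int)
    for row in lines[header_index + 1:]:
        fields = row.split()
        totals[fields[state_col]] += int(fields[cpus_col])
    return dict(totals)


print(cpus_by_state(SAMPLE))
```

For the listing above this gives 24 drained CPUs (gimel) and 184 idle CPUs across the three compute nodes.&lt;br /&gt;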
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12669</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12669"/>
		<updated>2020-06-03T20:47:01Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we plan to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has already been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with your setup.&amp;lt;br&amp;gt;&lt;br /&gt;
Finally, compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful DOCK commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful Slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
 https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
 https://wiki.fysik.dtu.dk/niflheim/Slurm_installation&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the packages built by rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is the authentication daemon that Slurm uses to validate messages exchanged between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm directory.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039; and you will see:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12668</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12668"/>
		<updated>2020-06-02T16:42:20Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install the AWS command-line client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install AWS ParallelCluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section.&amp;lt;br&amp;gt;&lt;br /&gt;
If you do not have an access key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (these values are stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) configure the parallel cluster&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
You are welcome to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
If you do not have an EC2 key pair yet, create one in the EC2 console under Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair.&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is ready and stored in ~/.parallelcluster/config&amp;lt;br&amp;gt;&lt;br /&gt;
You may review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run &#039;&#039;pcluster create -c ~/.parallelcluster/config UCSFbeta&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
In EC2-&amp;gt;Instances-&amp;gt;Instances you will see your master node waiting for jobs.&amp;lt;br&amp;gt;&lt;br /&gt;
You now have two launch templates: one for the master node and one for the compute nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
You can modify them here: EC2-&amp;gt;Instances-&amp;gt;Launch Templates&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In EC2-&amp;gt;Auto Scaling-&amp;gt;Auto Scaling Groups you can modify the parameters that shape your cluster,&amp;lt;br&amp;gt;&lt;br /&gt;
e.g., min size, max size, desired size, default cooldown (when to start terminating idle compute nodes), etc.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To connect to your master node via SSH, run a command similar to &#039;&#039;ssh -i &amp;quot;YOUR_PRIVATE_KEY.pem&amp;quot; centos@ec2-54-89-150-98.compute-1.amazonaws.com&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
As &#039;&#039;sinfo -lNe&#039;&#039; shows, no compute nodes are available at first (cost-saving mode).&amp;lt;br&amp;gt;&lt;br /&gt;
To bring the compute nodes up, it is enough to submit even a trivial job: &#039;&#039;srun -n4 hostname&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Response: &#039;&#039;srun: Required node not available (down, drained or reserved)&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What happens next: within 1 minute, jobwatcher notices that there are jobs in the queue.&amp;lt;br&amp;gt;&lt;br /&gt;
The system brings up extra compute nodes (up to the max size parameter) and the queued jobs start running.&amp;lt;br&amp;gt;&lt;br /&gt;
When compute nodes become idle, the system terminates them (only those that have been idle longer than the &amp;quot;cooldown time&amp;quot;).&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Once the nodes are up, you will see them in the list of available resources: &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Tue Jun  2 14:20:55 2020&lt;br /&gt;
  NODELIST          NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
  ip-172-31-16-148      1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-18-22       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-22-52       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-29-18       1  compute*        idle    1    1:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
Running a job on 4 CPUs:&lt;br /&gt;
&lt;br /&gt;
  [centos@ip-172-31-0-25 ~]$ srun -n4 hostname&lt;br /&gt;
  ip-172-31-25-53&lt;br /&gt;
  ip-172-31-18-252&lt;br /&gt;
  ip-172-31-23-39&lt;br /&gt;
  ip-172-31-16-128&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12667</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12667"/>
		<updated>2020-06-02T14:50:14Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install AWS ParallelCluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the AWS console under &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
If you do not have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (the keys will be stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
Feel free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
If you have no EC2 key pair yet, create one in the EC2 console: Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config&amp;lt;br&amp;gt;&lt;br /&gt;
You may review and edit it if needed.&lt;br /&gt;
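&lt;br /&gt;
A minimal Python sketch for inspecting the generated config before creating the cluster. It assumes the INI layout used by ParallelCluster 2.x; the section and key names in the sample (initial_queue_size, max_queue_size and so on) and the helper summarize_cluster are illustrative assumptions, so compare them with your own ~/.parallelcluster/config.&lt;br /&gt;
&lt;br /&gt;
```python
import configparser

# Hypothetical excerpt of ~/.parallelcluster/config. The section and key names
# are assumptions based on the ParallelCluster 2.x INI layout and may differ
# in your version.
SAMPLE_CONFIG = """
[aws]
aws_region_name = us-east-1

[cluster default]
key_name = EC2_v1
scheduler = slurm
master_instance_type = t2.micro
compute_instance_type = t2.micro
initial_queue_size = 0
max_queue_size = 10
"""


def summarize_cluster(config_text, section="cluster default"):
    """Return the main choices made during 'pcluster configure' as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    cluster = parser[section]
    return {
        "region": parser["aws"]["aws_region_name"],
        "scheduler": cluster["scheduler"],
        "min_size": cluster.getint("initial_queue_size"),
        "max_size": cluster.getint("max_queue_size"),
        "compute_type": cluster["compute_instance_type"],
    }


if __name__ == "__main__":
    for key, value in summarize_cluster(SAMPLE_CONFIG).items():
        print(key, "=", value)
```
&lt;br /&gt;
To check a real file, read it into a string first and pass that string to summarize_cluster.&lt;br /&gt;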
&lt;br /&gt;
To create a cluster on AWS, run &#039;&#039;pcluster create -c ~/.parallelcluster/config UCSFbeta&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
Under EC2 -&amp;gt; Instances -&amp;gt; Instances you will see your master node waiting for jobs.&amp;lt;br&amp;gt;&lt;br /&gt;
You now have two launch templates: one for the master node and one for the compute nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
You can modify them under EC2 -&amp;gt; Instances -&amp;gt; Launch Templates&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Under EC2 -&amp;gt; Auto Scaling -&amp;gt; Auto Scaling Groups you can modify the cluster shape parameters,&amp;lt;br&amp;gt;&lt;br /&gt;
e.g., min size, max size, desired size, and default cooldown (when to start terminating idle compute nodes).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To connect to your master node via SSH, use a command similar to &#039;&#039;ssh -i &amp;quot;YOUR_PRIVATE_KEY.pem&amp;quot; centos@ec2-54-89-150-98.compute-1.amazonaws.com&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
As &#039;&#039;sinfo -lNe&#039;&#039; shows, no compute resources are available at first (cost-saving mode).&amp;lt;br&amp;gt;&lt;br /&gt;
To bring the compute nodes up, it is sufficient to submit even a simple job: &#039;&#039;srun -n4 hostname&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Response: srun: Required node not available (down, drained or reserved)&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What happens next: within 1 minute, jobwatcher notices that there are jobs in the queue.&amp;lt;br&amp;gt;&lt;br /&gt;
The system brings up extra resources (up to the max size parameter) and the queued jobs start running.&amp;lt;br&amp;gt;&lt;br /&gt;
When compute nodes become idle, the system terminates those that have been idle longer than the &amp;quot;cooldown time&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
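&lt;br /&gt;
The scale-up step described above can be pictured with a small Python model. This is a simplified sketch and not the actual jobwatcher code: the function desired_nodes and all of its parameters are hypothetical, and it assumes that every compute node offers the same number of CPUs.&lt;br /&gt;
&lt;br /&gt;
```python
import math


def desired_nodes(pending_cpus, cpus_per_node, running_nodes, min_size, max_size):
    """Simplified model of scale-up: add nodes for the pending CPUs, within limits.

    All names here are illustrative assumptions, not ParallelCluster internals.
    """
    # Number of extra nodes needed to serve the CPUs requested by queued jobs.
    needed = math.ceil(pending_cpus / cpus_per_node)
    target = running_nodes + needed
    # Clamp the target between the min size and max size of the scaling group.
    return max(min_size, min(max_size, target))


if __name__ == "__main__":
    # srun -n4 on an empty cluster of 1-CPU nodes, min size 0 and max size 10.
    print(desired_nodes(pending_cpus=4, cpus_per_node=1,
                        running_nodes=0, min_size=0, max_size=10))
```
&lt;br /&gt;
With the values from this walkthrough (4 requested CPUs, 1 CPU per t2.micro node, max size 10) the model asks for 4 nodes; a request larger than the max size is capped at 10.&lt;br /&gt;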
&lt;br /&gt;
&lt;br /&gt;
When the nodes are up, you will see them in the list of available resources: &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Tue Jun  2 14:20:55 2020&lt;br /&gt;
  NODELIST          NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
  ip-172-31-16-148      1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-18-22       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-22-52       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-29-18       1  compute*        idle    1    1:1:1      1        0      1   (null) none&lt;br /&gt;
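&lt;br /&gt;
A listing like the one above can also be checked from a script. The following Python sketch counts idle CPUs in text shaped like this sinfo output; the helper idle_cpus is hypothetical, and it assumes the column layout shown here with no spaces in the columns before STATE and CPUS.&lt;br /&gt;
&lt;br /&gt;
```python
# Sample text copied from the sinfo listing in this page.
SAMPLE_SINFO = """
Tue Jun  2 14:20:55 2020
NODELIST          NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
ip-172-31-16-148      1  compute*        idle    1    1:1:1      1        0      1   (null) none
ip-172-31-18-22       1  compute*        idle    1    1:1:1      1        0      1   (null) none
ip-172-31-22-52       1  compute*        idle    1    1:1:1      1        0      1   (null) none
ip-172-31-29-18       1  compute*        idle    1    1:1:1      1        0      1   (null) none
"""


def idle_cpus(sinfo_text):
    """Sum the CPUS column over all nodes whose STATE column is 'idle'."""
    lines = [line for line in sinfo_text.splitlines() if line.strip()]
    # Locate the header row; the timestamp line above it is skipped.
    header_index = next(
        i for i, line in enumerate(lines) if line.split()[0] == "NODELIST"
    )
    columns = lines[header_index].split()
    state_col = columns.index("STATE")
    cpus_col = columns.index("CPUS")
    total = 0
    for line in lines[header_index + 1:]:
        fields = line.split()
        if fields[state_col] == "idle":
            total += int(fields[cpus_col])
    return total


if __name__ == "__main__":
    print("idle CPUs:", idle_cpus(SAMPLE_SINFO))
```
&lt;br /&gt;
On the master node the same function could be fed the captured output of sinfo -lNe instead of the sample text.&lt;br /&gt;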
&lt;br /&gt;
Running a job on 4 CPUs:&lt;br /&gt;
&lt;br /&gt;
  [centos@ip-172-31-0-25 ~]$ srun -n4 hostname&lt;br /&gt;
  ip-172-31-25-53&lt;br /&gt;
  ip-172-31-18-252&lt;br /&gt;
  ip-172-31-23-39&lt;br /&gt;
  ip-172-31-16-128&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12666</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12666"/>
		<updated>2020-06-02T14:49:32Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install AWS ParallelCluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the AWS console under &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
If you do not have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (the keys will be stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
Feel free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
If you have no EC2 key pair yet, create one in the EC2 console: Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config&amp;lt;br&amp;gt;&lt;br /&gt;
You may review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run &#039;&#039;pcluster create -c ~/.parallelcluster/config UCSFbeta&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
Under EC2 -&amp;gt; Instances -&amp;gt; Instances you will see your master node waiting for jobs.&amp;lt;br&amp;gt;&lt;br /&gt;
You now have two launch templates: one for the master node and one for the compute nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
You can modify them under EC2 -&amp;gt; Instances -&amp;gt; Launch Templates&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Under EC2 -&amp;gt; Auto Scaling -&amp;gt; Auto Scaling Groups you can modify the cluster shape parameters,&amp;lt;br&amp;gt;&lt;br /&gt;
e.g., min size, max size, desired size, and default cooldown (when to start terminating idle compute nodes).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To connect to your master node via SSH, use a command similar to &#039;&#039;ssh -i &amp;quot;YOUR_PRIVATE_KEY.pem&amp;quot; centos@ec2-54-89-150-98.compute-1.amazonaws.com&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
As &#039;&#039;sinfo -lNe&#039;&#039; shows, no compute resources are available at first (cost-saving mode).&amp;lt;br&amp;gt;&lt;br /&gt;
To bring the compute nodes up, it is sufficient to submit even a simple job: &#039;&#039;srun -n4 hostname&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Response: srun: Required node not available (down, drained or reserved)&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What happens next: within 1 minute, jobwatcher notices that there are jobs in the queue.&amp;lt;br&amp;gt;&lt;br /&gt;
The system brings up extra resources (up to the max size parameter) and the queued jobs start running.&amp;lt;br&amp;gt;&lt;br /&gt;
When compute nodes become idle, the system terminates those that have been idle longer than the &amp;quot;cooldown time&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When the nodes are up, you will see them in the list of available resources:&lt;br /&gt;
  &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Tue Jun  2 14:20:55 2020&lt;br /&gt;
  NODELIST          NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
  ip-172-31-16-148      1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-18-22       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-22-52       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-29-18       1  compute*        idle    1    1:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
Running a job on 4 CPUs:&lt;br /&gt;
&lt;br /&gt;
  [centos@ip-172-31-0-25 ~]$ srun -n4 hostname&lt;br /&gt;
  ip-172-31-25-53&lt;br /&gt;
  ip-172-31-18-252&lt;br /&gt;
  ip-172-31-23-39&lt;br /&gt;
  ip-172-31-16-128&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12665</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12665"/>
		<updated>2020-06-02T14:42:43Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install AWS ParallelCluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the AWS console under &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
If you do not have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (the keys will be stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
Feel free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
If you have no EC2 key pair yet, create one in the EC2 console: Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config&amp;lt;br&amp;gt;&lt;br /&gt;
You may review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run &#039;&#039;pcluster create -c ~/.parallelcluster/config UCSFbeta&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
Under EC2 -&amp;gt; Instances -&amp;gt; Instances you will see your master node waiting for jobs.&amp;lt;br&amp;gt;&lt;br /&gt;
You now have two launch templates: one for the master node and one for the compute nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
You can modify them under EC2 -&amp;gt; Instances -&amp;gt; Launch Templates&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Under EC2 -&amp;gt; Auto Scaling -&amp;gt; Auto Scaling Groups you can modify the cluster shape parameters,&amp;lt;br&amp;gt;&lt;br /&gt;
e.g., min size, max size, desired size, and default cooldown (when to start terminating idle compute nodes).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To connect to your master node via SSH, use a command similar to &#039;&#039;ssh -i &amp;quot;YOUR_PRIVATE_KEY.pem&amp;quot; centos@ec2-54-89-150-98.compute-1.amazonaws.com&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
As &#039;&#039;sinfo -lNe&#039;&#039; shows, no compute resources are available at first (cost-saving mode).&amp;lt;br&amp;gt;&lt;br /&gt;
To bring the compute nodes up, it is sufficient to submit even a simple job: &#039;&#039;srun -n4 hostname&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Response: srun: Required node not available (down, drained or reserved)&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What happens next: within 1 minute, jobwatcher notices that there are jobs in the queue.&amp;lt;br&amp;gt;&lt;br /&gt;
The system brings up extra resources (up to the max size parameter) and the queued jobs start running.&amp;lt;br&amp;gt;&lt;br /&gt;
When compute nodes become idle, the system terminates those that have been idle longer than the &amp;quot;cooldown time&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
  When the nodes are up, you will see them in the list of available resources:&lt;br /&gt;
  &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Tue Jun  2 14:20:55 2020&lt;br /&gt;
  NODELIST          NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
  ip-172-31-16-148      1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-18-22       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-22-52       1  compute*        idle    1    1:1:1      1        0      1   (null) none                &lt;br /&gt;
  ip-172-31-29-18       1  compute*        idle    1    1:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
  &#039;&#039;srun -n4 hostname&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12664</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12664"/>
		<updated>2020-06-02T14:28:34Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install AWS ParallelCluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the AWS console under &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
If you do not have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (the keys will be stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
Feel free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
If you have no EC2 key pair yet, create one in the EC2 console: Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config&amp;lt;br&amp;gt;&lt;br /&gt;
You may review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run &#039;&#039;pcluster create -c ~/.parallelcluster/config UCSFbeta&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
Under EC2 -&amp;gt; Instances -&amp;gt; Instances you will see your master node waiting for jobs.&amp;lt;br&amp;gt;&lt;br /&gt;
You now have two launch templates: one for the master node and one for the compute nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
You can modify them under EC2 -&amp;gt; Instances -&amp;gt; Launch Templates&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Under EC2 -&amp;gt; Auto Scaling -&amp;gt; Auto Scaling Groups you can modify the cluster shape parameters,&amp;lt;br&amp;gt;&lt;br /&gt;
e.g., min size, max size, desired size, and default cooldown (when to start terminating idle compute nodes).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To connect to your master node via SSH, use a command similar to &#039;&#039;ssh -i &amp;quot;YOUR_PRIVATE_KEY.pem&amp;quot; centos@ec2-54-89-150-98.compute-1.amazonaws.com&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
As &#039;&#039;sinfo -lNe&#039;&#039; shows, no compute resources are available at first (cost-saving mode).&amp;lt;br&amp;gt;&lt;br /&gt;
To bring the compute nodes up, it is sufficient to submit even a simple job: &#039;&#039;srun -n4 hostname&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Response: srun: Required node not available (down, drained or reserved)&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What happens next: within 1 minute, jobwatcher notices that there are jobs in the queue.&amp;lt;br&amp;gt;&lt;br /&gt;
The system brings up extra resources (up to the max size parameter) and the queued jobs start running.&amp;lt;br&amp;gt;&lt;br /&gt;
When compute nodes become idle, the system terminates those that have been idle longer than the &amp;quot;cooldown time&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12663</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12663"/>
		<updated>2020-06-02T14:27:31Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install AWS ParallelCluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the AWS console under &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot;.&amp;lt;br&amp;gt;&lt;br /&gt;
If you do not have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (the keys will be stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
Feel free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
If you have no EC2 key pair yet, create one in the EC2 console: Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config.&amp;lt;br&amp;gt;&lt;br /&gt;
Review and edit it if needed.&lt;br /&gt;
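&lt;br /&gt;
If you prefer to script such edits, here is a minimal Python sketch, assuming the file uses the INI layout written by pcluster configure.&lt;br /&gt;
The section and key names in the sample (cluster default, max_queue_size and so on) are assumptions based on ParallelCluster 2.x; check them against your own generated file. The helper update_cluster_settings is hypothetical.&lt;br /&gt;

```python
import configparser
import io

# Hypothetical sample; section and key names are assumed, so compare them
# with the file that pcluster configure actually generated for you.
SAMPLE_CONFIG = """
[aws]
aws_region_name = us-east-1

[global]
cluster_template = default

[cluster default]
key_name = EC2_v1
scheduler = slurm
initial_queue_size = 0
max_queue_size = 10
master_instance_type = t2.micro
compute_instance_type = t2.micro
"""


def update_cluster_settings(text, section, **settings):
    """Return the config text with the given keys set in one section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    for key, value in settings.items():
        # Raises configparser.NoSectionError if the section does not exist.
        parser.set(section, key, str(value))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(update_cluster_settings(SAMPLE_CONFIG, "cluster default", max_queue_size=20))
```

Note that configparser does not preserve comments when it writes the file back, so keep a copy of the original before overwriting it.&lt;br /&gt;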
&lt;br /&gt;
To create a cluster on AWS, run:&lt;br /&gt;
  pcluster create -c ~/.parallelcluster/config UCSFbeta&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
In EC2-&amp;gt;Instances-&amp;gt;Instances you will see your master node waiting for jobs.&amp;lt;br&amp;gt;&lt;br /&gt;
You now have two launch templates: one for the master node and one for the compute nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
You can modify them under EC2-&amp;gt;Instances-&amp;gt;Launch Templates.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In EC2-&amp;gt;Auto Scaling-&amp;gt;Auto Scaling Groups you can modify the parameters that shape your cluster, e.g. min size, max size, desired size and default cooldown (when to start terminating idle compute nodes).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To connect to your master node via SSH, run a command similar to &#039;&#039;ssh -i &amp;quot;YOUR_PRIVATE_KEY.pem&amp;quot; centos@ec2-54-89-150-98.compute-1.amazonaws.com&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
As sinfo -lNe shows, no compute resources are available at first (cost-saving mode).&lt;br /&gt;
To bring the compute nodes up, it is enough to submit even a simple job: srun -n4 hostname&lt;br /&gt;
The initial response is: srun: Required node not available (down, drained or reserved)&lt;br /&gt;
&lt;br /&gt;
What happens next: within 1 minute jobwatcher notices that there are jobs in the queue.&amp;lt;br&amp;gt;&lt;br /&gt;
The system brings up extra resources (up to the maxsize parameter) and the queue starts computing.&amp;lt;br&amp;gt;&lt;br /&gt;
When compute nodes become idle, the system terminates them (only those that have been idle longer than the &amp;quot;cooldown time&amp;quot;).&amp;lt;br&amp;gt;&lt;br /&gt;
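&lt;br /&gt;
The scale-up and scale-down behaviour described above can be pictured with a toy Python model.&lt;br /&gt;
This is only a sketch under stated assumptions (one compute node per pending job, idle time tracked per node); the function names are hypothetical and this is not the code that ParallelCluster or jobwatcher actually runs.&lt;br /&gt;

```python
def nodes_to_launch(pending_jobs, running_nodes, max_size):
    """Nodes to add for the pending jobs without exceeding max_size.

    Assumes, for illustration only, that each pending job needs one node.
    """
    headroom = max(0, max_size - running_nodes)
    return min(pending_jobs, headroom)


def idle_overrun(idle_seconds, cooldown_seconds):
    """Seconds a node has been idle beyond the cooldown (0 if still within it)."""
    return max(0, idle_seconds - cooldown_seconds)


def nodes_to_terminate(idle_seconds_by_node, cooldown_seconds):
    """Names of the nodes that have been idle longer than the cooldown time."""
    return sorted(
        name
        for name, idle in idle_seconds_by_node.items()
        if idle_overrun(idle, cooldown_seconds)
    )


if __name__ == "__main__":
    # Four queued jobs, empty cluster, limit of ten nodes.
    print(nodes_to_launch(pending_jobs=4, running_nodes=0, max_size=10))
    # Hypothetical node names with their idle times in seconds.
    print(nodes_to_terminate({"node-a": 700, "node-b": 100}, cooldown_seconds=300))
```

In this model only nodes whose idle time exceeds the cooldown are selected for termination, and the number of launched nodes is capped by the maximum cluster size.&lt;br /&gt;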
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12662</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12662"/>
		<updated>2020-06-02T14:15:43Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install aws-parallelcluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section of the AWS console.&amp;lt;br&amp;gt;&lt;br /&gt;
If you don&#039;t have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (both values are stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
You are free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
An EC2 key pair is required for the next prompt; create one in the EC2 console under Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair.&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config.&amp;lt;br&amp;gt;&lt;br /&gt;
Review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run:&lt;br /&gt;
  pcluster create -c ~/.parallelcluster/config UCSFbeta&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
In EC2-&amp;gt;Instances-&amp;gt;Instances you will see your master node waiting for jobs.&amp;lt;br&amp;gt;&lt;br /&gt;
You now have two launch templates: one for the master node and one for the compute nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
You can modify them under EC2-&amp;gt;Instances-&amp;gt;Launch Templates.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In EC2-&amp;gt;Auto Scaling-&amp;gt;Auto Scaling Groups you can modify the parameters that shape your cluster, e.g. min size, max size, desired size and default cooldown (when to start terminating idle compute nodes).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To connect to your master node via SSH, run a command similar to &#039;&#039;ssh -i &amp;quot;YOUR_PRIVATE_KEY.pem&amp;quot; centos@ec2-54-89-150-98.compute-1.amazonaws.com&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12661</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12661"/>
		<updated>2020-06-02T13:59:19Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install aws-parallelcluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section of the AWS console.&amp;lt;br&amp;gt;&lt;br /&gt;
If you don&#039;t have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (both values are stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
You are free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
An EC2 key pair is required for the next prompt; create one in the EC2 console under Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair.&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config.&amp;lt;br&amp;gt;&lt;br /&gt;
Review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run:&lt;br /&gt;
  pcluster create -c ~/.parallelcluster/config UCSFbeta&lt;br /&gt;
&lt;br /&gt;
  Beginning cluster creation for cluster: UCSFbeta&lt;br /&gt;
  Creating stack named: parallelcluster-UCSFbeta&lt;br /&gt;
  Status: ComputeFleet - CREATE_COMPLETE                                          &lt;br /&gt;
  Status: parallelcluster-UCSFbeta - CREATE_COMPLETE                              &lt;br /&gt;
  ClusterUser: centos&lt;br /&gt;
  MasterPrivateIP: 172.31.0.25&lt;br /&gt;
&lt;br /&gt;
Your cluster is ready to go!&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12660</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12660"/>
		<updated>2020-06-02T13:57:23Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install aws-parallelcluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section of the AWS console.&amp;lt;br&amp;gt;&lt;br /&gt;
If you don&#039;t have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (both values are stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
You are free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
An EC2 key pair is required for the next prompt; create one in the EC2 console under Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair.&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]:   &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Maximum cluster size (instances) [10]:  &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Master instance type [t2.micro]:        &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Compute instance type [t2.micro]:       &amp;lt;------- THIS CAN BE CHANGED LATER&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config.&amp;lt;br&amp;gt;&lt;br /&gt;
Review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run:&lt;br /&gt;
  pcluster create -c ~/.parallelcluster/config UCSFbeta&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12659</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12659"/>
		<updated>2020-06-02T13:51:51Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install aws-parallelcluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section of the AWS console.&amp;lt;br&amp;gt;&lt;br /&gt;
If you don&#039;t have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (both values are stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
You are free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
An EC2 key pair is required for the next prompt; create one in the EC2 console under Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair.&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]: &lt;br /&gt;
  Maximum cluster size (instances) [10]: &lt;br /&gt;
  Master instance type [t2.micro]: &lt;br /&gt;
  Compute instance type [t2.micro]:&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The config file is now stored in ~/.parallelcluster/config.&amp;lt;br&amp;gt;&lt;br /&gt;
Review and edit it if needed.&lt;br /&gt;
&lt;br /&gt;
To create a cluster on AWS, run:&lt;br /&gt;
  pcluster create -c ~/.parallelcluster/config UCSFbeta&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12658</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12658"/>
		<updated>2020-06-02T13:46:27Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install aws-parallelcluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section of the AWS console.&amp;lt;br&amp;gt;&lt;br /&gt;
If you don&#039;t have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (both values are stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
You are free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
  Allowed values for AWS Region ID:&lt;br /&gt;
  1. ap-northeast-1&lt;br /&gt;
  2. ap-northeast-2&lt;br /&gt;
  3. ap-south-1&lt;br /&gt;
  4. ap-southeast-1&lt;br /&gt;
  5. ap-southeast-2&lt;br /&gt;
  6. ca-central-1&lt;br /&gt;
  7. eu-central-1&lt;br /&gt;
  8. eu-north-1&lt;br /&gt;
  9. eu-west-1&lt;br /&gt;
  10. eu-west-2&lt;br /&gt;
  11. eu-west-3&lt;br /&gt;
  12. sa-east-1&lt;br /&gt;
  13. us-east-1&lt;br /&gt;
  14. us-east-2&lt;br /&gt;
  15. us-west-1&lt;br /&gt;
  16. us-west-2&lt;br /&gt;
  AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
An EC2 key pair is required for the next prompt; create one in the EC2 console under Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair.&lt;br /&gt;
&lt;br /&gt;
  Allowed values for EC2 Key Pair Name:&lt;br /&gt;
  1. EC2_v1&lt;br /&gt;
  EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
  Allowed values for Scheduler:&lt;br /&gt;
  1. sge&lt;br /&gt;
  2. torque&lt;br /&gt;
  3. slurm&lt;br /&gt;
  4. awsbatch&lt;br /&gt;
  Scheduler [slurm]:&lt;br /&gt;
  Minimum cluster size (instances) [0]: &lt;br /&gt;
  Maximum cluster size (instances) [10]: &lt;br /&gt;
  Master instance type [t2.micro]: &lt;br /&gt;
  Compute instance type [t2.micro]:&lt;br /&gt;
  Automate VPC creation? (y/n) [n]: &lt;br /&gt;
  Allowed values for VPC ID:&lt;br /&gt;
  1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
  VPC ID [vpc-579d8e2d]: &lt;br /&gt;
  Allowed values for Network Configuration:&lt;br /&gt;
  1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
  2. Master and compute fleet in the same public subnet&lt;br /&gt;
  Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12657</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12657"/>
		<updated>2020-06-02T13:42:54Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install aws-parallelcluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key.&amp;lt;br&amp;gt;&lt;br /&gt;
They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section of the AWS console.&amp;lt;br&amp;gt;&lt;br /&gt;
If you don&#039;t have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
  (both values are stored in ~/.aws/credentials)&lt;br /&gt;
&lt;br /&gt;
e) parallel cluster configuration&lt;br /&gt;
NB: We use AWS Region us-east-1, which corresponds to N. Virginia.&amp;lt;br&amp;gt;&lt;br /&gt;
You are free to choose a different region.&lt;br /&gt;
&lt;br /&gt;
  pcluster configure&lt;br /&gt;
&lt;br /&gt;
Allowed values for AWS Region ID:&lt;br /&gt;
1. ap-northeast-1&lt;br /&gt;
2. ap-northeast-2&lt;br /&gt;
3. ap-south-1&lt;br /&gt;
4. ap-southeast-1&lt;br /&gt;
5. ap-southeast-2&lt;br /&gt;
6. ca-central-1&lt;br /&gt;
7. eu-central-1&lt;br /&gt;
8. eu-north-1&lt;br /&gt;
9. eu-west-1&lt;br /&gt;
10. eu-west-2&lt;br /&gt;
11. eu-west-3&lt;br /&gt;
12. sa-east-1&lt;br /&gt;
13. us-east-1&lt;br /&gt;
14. us-east-2&lt;br /&gt;
15. us-west-1&lt;br /&gt;
16. us-west-2&lt;br /&gt;
AWS Region ID [us-east-1]:&lt;br /&gt;
&lt;br /&gt;
An EC2 key pair is required for the next prompt; create one in the EC2 console under Network &amp;amp; Security -&amp;gt; Key Pairs -&amp;gt; Create New Pair.&lt;br /&gt;
Allowed values for EC2 Key Pair Name:&lt;br /&gt;
1. EC2_v1&lt;br /&gt;
EC2 Key Pair Name [EC2_v1]:&lt;br /&gt;
Allowed values for Scheduler:&lt;br /&gt;
1. sge&lt;br /&gt;
2. torque&lt;br /&gt;
3. slurm&lt;br /&gt;
4. awsbatch&lt;br /&gt;
Scheduler [slurm]:&lt;br /&gt;
Minimum cluster size (instances) [0]: &lt;br /&gt;
Maximum cluster size (instances) [10]: &lt;br /&gt;
Master instance type [t2.micro]: &lt;br /&gt;
Compute instance type [t2.micro]:&lt;br /&gt;
Automate VPC creation? (y/n) [n]: &lt;br /&gt;
Allowed values for VPC ID:&lt;br /&gt;
1. vpc-579d8e2d | 0 subnets inside&lt;br /&gt;
VPC ID [vpc-579d8e2d]: &lt;br /&gt;
Allowed values for Network Configuration:&lt;br /&gt;
1. Master in a public subnet and compute fleet in a private subnet&lt;br /&gt;
2. Master and compute fleet in the same public subnet&lt;br /&gt;
Network Configuration [Master in a public subnet and compute fleet in a private subnet]: 1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12656</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12656"/>
		<updated>2020-06-02T13:21:59Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
a) upgrade your pip:&lt;br /&gt;
pip install --upgrade pip&lt;br /&gt;
&lt;br /&gt;
b) install aws client:&lt;br /&gt;
pip install awscli&lt;br /&gt;
&lt;br /&gt;
c) install aws-parallelcluster:&lt;br /&gt;
pip install aws-parallelcluster&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
d) prepare your aws_access_key_id and aws_secret_access_key. They can be found in the &amp;quot;My Security Credentials -&amp;gt; Access keys (access key ID and secret access key)&amp;quot; section.&lt;br /&gt;
If you do not have a key yet, press &amp;quot;Create New Access Key&amp;quot; and follow the instructions.&lt;br /&gt;
&lt;br /&gt;
  aws configure&lt;br /&gt;
  AWS Access Key ID [None]: _YOUR_ACCESS_KEY_ID_&lt;br /&gt;
  AWS Secret Access Key [None]: _YOUR_SECRET_ACCESS_KEY_&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12655</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12655"/>
		<updated>2020-06-01T19:33:41Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Step-by-step instructions for creating a Slurm cluster on AWS with auto-scaling&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful links:&#039;&#039;&#039;&lt;br /&gt;
* https://support.lumerical.com/hc/en-us/articles/360034082194-AWS-Use-Case-Multi-node-clusters-with-launch-templates&lt;br /&gt;
* https://github.com/aws/aws-parallelcluster&lt;br /&gt;
* https://aws-parallelcluster.readthedocs.io/en/latest/&lt;br /&gt;
* https://docs.aws.amazon.com/parallelcluster/index.html&lt;br /&gt;
* https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12654</id>
		<title>AWS Auto Scaling</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=AWS_Auto_Scaling&amp;diff=12654"/>
		<updated>2020-06-01T19:22:44Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: Created page with &amp;quot;A step-by-step instruction how to create a slurm cluster on AWS with auto-scaling possibility&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A step-by-step instruction how to create a slurm cluster on AWS with auto-scaling possibility&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12636</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12636"/>
		<updated>2020-05-28T21:32:37Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm user-guide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder.&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
You may want to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
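The listing above can be condensed into a count of jobs per state. This is only a sketch: the function name count_job_states is hypothetical, and it assumes the default squeue column layout shown above, where the fifth column (ST) holds the job state.&lt;br /&gt;

```shell
# Summarise squeue-style output read from stdin: one line per job state,
# e.g. "PD 1" for pending and "R 12" for running jobs.
# Assumes the default column order, with the state (ST) in column 5.
count_job_states() {
    awk '$1 != "JOBID" { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Hypothetical usage on the cluster:
#   squeue -u "$USER" | count_job_states
```

Counting by state is a quick way to check that an array job is actually being scheduled and is not stuck entirely in the pending state.&lt;br /&gt;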
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the packages built by rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is the authentication daemon that Slurm uses to validate credentials exchanged between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
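Before starting the daemon, it may be worth confirming that the key really has the restrictive mode set above. A minimal sketch, assuming GNU stat as shipped with CentOS; the function name key_mode_ok is hypothetical.&lt;br /&gt;

```shell
# Succeed only if the given file has permission mode 400
# (read-only for the owner, no access for anyone else).
key_mode_ok() {
    [ "$(stat -c %a "$1")" = "400" ]
}

# Hypothetical usage on a node, run as root:
#   key_mode_ok /etc/munge/munge.key || echo "fix permissions on munge.key"
```

Once munge is running, the customary round-trip check is &#039;&#039;munge -n | unmunge&#039;&#039;, which encodes a credential and decodes it again on the same node.&lt;br /&gt;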
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm line&amp;quot; with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute node (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
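The node definition appended to slurm.conf in the steps above can be generated rather than typed by hand. This is a sketch only: the function name node_line is hypothetical, and it reproduces just the four fields used on this page.&lt;br /&gt;

```shell
# Print a slurm.conf node definition in the format used above.
# Arguments: node name, node address, CPU count.
node_line() {
    printf 'NodeName=%s NodeAddr=%s CPUs=%s State=UNKNOWN\n' "$1" "$2" "$3"
}

# Hypothetical usage on the new node; nproc reports the CPUs available there:
#   node_line n-1-17 10.20.1.17 "$(nproc)"
```

Running &#039;&#039;slurmd -C&#039;&#039; on the node prints the hardware configuration that slurmd itself detects, which is a useful cross-check for the CPU count.&lt;br /&gt;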
Finally, configure the firewall to allow communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output should look like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12635</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12635"/>
		<updated>2020-05-28T21:27:15Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder.&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
You may want to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the packages built by rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is the authentication daemon that Slurm uses to validate credentials exchanged between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm line&amp;quot; with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute node (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output should look like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12634</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12634"/>
		<updated>2020-05-28T21:26:02Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder.&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig” project that has been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
You may want to compare your results with the reference run:&lt;br /&gt;
* CHEMBL4422_active_ligands.mol2 - the top 500 scoring poses&lt;br /&gt;
* extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root on gimel, you can modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the packages built by rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is the authentication daemon that Slurm uses to validate credentials exchanged between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm line&amp;quot; with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start slurmd on the compute node (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output should look like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12633</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12633"/>
		<updated>2020-05-28T21:25:11Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To install it, run &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with your setup.&amp;lt;br&amp;gt;&lt;br /&gt;
Ultimately, you may need to compare your results with the reference run:&lt;br /&gt;
CHEMBL4422_active_ligands.mol2 - TOP500 scoring poses &lt;br /&gt;
extract_all.sort.uniq.txt - a print-out of scoring details&lt;br /&gt;
&lt;br /&gt;
The Slurm queue manager is installed locally on gimel; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful docking commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
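With a large job array it can help to summarise a listing like the one above by job state. The following is a minimal sketch under stated assumptions: count_job_states is a hypothetical helper, not a Slurm command, and it assumes the default squeue column layout shown above, where ST is the fifth field.&lt;br /&gt;

```shell
# Hypothetical helper: count rows per job state from squeue-style text on stdin.
# Assumes the default column layout shown above (ST is the 5th field).
# Note: a pending array such as 4187_[637-2091] is one row, so it counts once.
count_job_states() {
    awk 'NR != 1 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Example using a header and two rows copied from the listing above (no cluster needed).
printf '%s\n' \
    '             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)' \
    '   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)' \
    '          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20' \
    | count_job_states
```

On the cluster itself the same function can be fed directly, e.g. &#039;&#039;squeue | count_job_states&#039;&#039;.&lt;br /&gt;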
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: adduser slurm.&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) of the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start Slurm on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, open the firewall port used for communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output should look like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12632</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12632"/>
		<updated>2020-05-28T21:16:41Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To install it, run &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with your setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -20&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: adduser slurm.&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) of the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start Slurm on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, open the firewall port used for communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output should look like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12631</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12631"/>
		<updated>2020-05-28T21:09:22Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To install it, run &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with your setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, &#039;&#039;squeue&#039;&#039; should show something like this:&lt;br /&gt;
&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
   4187_[637-2091]     gimel array_jo  dudenko PD       0:00      1 (Resources)&lt;br /&gt;
          4187_629     gimel array_jo  dudenko  R       0:00      1 n-1-20&lt;br /&gt;
          4187_630     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_631     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_632     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_633     gimel array_jo  dudenko  R       0:00      1 n-1-21&lt;br /&gt;
          4187_634     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_635     gimel array_jo  dudenko  R       0:00      1 n-5-35&lt;br /&gt;
          4187_636     gimel array_jo  dudenko  R       0:00      1 n-5-34&lt;br /&gt;
          4187_622     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_623     gimel array_jo  dudenko  R       0:01      1 n-5-34&lt;br /&gt;
          4187_624     gimel array_jo  dudenko  R       0:01      1 n-5-35&lt;br /&gt;
          4187_625     gimel array_jo  dudenko  R       0:01      1 n-1-17&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: adduser slurm.&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating UIDs/GIDs between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) of the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enable and start Slurm on the compute nodes (CentOS 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, open the firewall port used for communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output should look like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12630</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12630"/>
		<updated>2020-05-28T20:41:06Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To install it, run &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project” that has been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information about a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
copy /etc/munge/munge.key from gimel to the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create the slurm user: adduser slurm&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/Memory resources you have at n-1-17 (see /proc/cpuinfo) and append the following line:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restarting slurm master node at gimel (Centos 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enabling and starting slurm computing nodes (Centos 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
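&lt;br /&gt;
The UID/GID requirement in the list above can be checked before slurmd is started. The following is only a sketch: the helper name passwd_ids is hypothetical (it is not part of Slurm), the example IDs 1001 and 1002 are made up, and it assumes the standard colon-separated passwd format in which fields 3 and 4 hold the UID and GID.&lt;br /&gt;

```shell
# Hypothetical helper (not part of Slurm): print "UID:GID" from one passwd-format line.
passwd_ids() {
    printf '%s\n' "$1" | cut -d: -f3,4
}

# Example with made-up IDs; on a real node pass the output of: getent passwd slurm
passwd_ids 'slurm:x:1001:1002::/nonexistent:/bin/false'
```

Running the helper on gimel and on the new node and comparing the two results shows whether the vipw edit described above is needed.&lt;br /&gt;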
&lt;br /&gt;
Last but not least, configure the firewall to allow communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the cluster, run &#039;&#039;sinfo -lNe&#039;&#039; and you will see:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
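&lt;br /&gt;
When several nodes have to be drained or returned, it may help to generate the scontrol lines first and review them before running anything. The following is a minimal dry-run sketch: the function name node_state_cmd is hypothetical, and it only prints commands of the same form as the two shown above.&lt;br /&gt;

```shell
# Hypothetical dry-run helper: print the scontrol command instead of running it.
node_state_cmd() {
    node="$1"
    state="$2"
    reason="${3:-}"
    cmd="scontrol update NodeName=${node} State=${state}"
    # Reason is appended only when one is given (this page supplies it when draining).
    if [ -n "$reason" ]; then
        cmd="${cmd} Reason=${reason}"
    fi
    printf '%s\n' "$cmd"
}

node_state_cmd n-1-17 DRAIN DRAINED
node_state_cmd n-1-17 IDLE
```

The printed lines can be copied into a root shell on gimel once they have been checked.&lt;br /&gt;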
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12629</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12629"/>
		<updated>2020-05-28T20:40:29Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the installer with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project” that has been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information about a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
copy /etc/munge/munge.key from gimel to the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create the slurm user: adduser slurm&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/Memory resources you have at n-1-17 (see /proc/cpuinfo) and append the following line:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restarting slurm master node at gimel (Centos 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enabling and starting slurm computing nodes (Centos 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, configure the firewall to allow communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To see the current state of the cluster, run &#039;&#039;sinfo -lNe&#039;&#039; and you will see:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12628</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12628"/>
		<updated>2020-05-28T20:39:20Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the installer with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project” that has been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information about a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
copy /etc/munge/munge.key from gimel to the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create the slurm user: adduser slurm&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/Memory resources you have at n-1-17 (see /proc/cpuinfo) and append the following line:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restarting slurm master node at gimel (Centos 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* enabling and starting slurm computing nodes (Centos 7): &#039;&#039;systemctl enable slurmd; systemctl start slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, configure the firewall to allow communication between the master node and the compute node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return it to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the cluster, run sinfo -lNe and you will see:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12627</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12627"/>
		<updated>2020-05-28T14:50:45Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder:&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the installer with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project” that has been run locally with DOCK-3.7.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, something is wrong with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: &#039;&#039;export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed information about a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instruction (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node is running CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
copy /etc/munge/munge.key from gimel to the local /etc/munge directory. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create the slurm user: adduser slurm&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the &amp;quot;slurm&amp;quot; line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* restart Slurm on the computing nodes (CentOS 7): &#039;&#039;systemctl restart slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
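The CPUs value in the node definition can be derived from the processor count. Below is a minimal Python sketch, assuming /proc/cpuinfo-style text with one "processor" entry per logical CPU; the helper names and the sample text are illustrative and not part of Slurm.&lt;br /&gt;

```python
# Hypothetical two-CPU sample; on a real node read /proc/cpuinfo instead.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: Example CPU\n"
)


def count_logical_cpus(cpuinfo_text):
    """Count the 'processor' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key = line.split(":", 1)[0].strip()
        if key == "processor":
            count += 1
    return count


def node_line(name, addr, cpus):
    """Format a slurm.conf node definition in the style used on this page."""
    return "NodeName={} NodeAddr={} CPUs={} State=UNKNOWN".format(name, addr, cpus)


print(node_line("n-1-17", "10.20.1.17", count_logical_cpus(SAMPLE_CPUINFO)))
```

On a real node, pass the contents of /proc/cpuinfo instead of SAMPLE_CPUINFO and check the resulting line before appending it to slurm.conf.&lt;br /&gt;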
Last but not least, open the firewall to allow communication between the master node and computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return the node to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12626</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12626"/>
		<updated>2020-05-28T14:50:14Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;GPR40 example:&#039;&#039;&#039; /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;ChEMBL ligands:&#039;&#039;&#039; /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the packages built by rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID and GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* restart Slurm on the computing nodes (CentOS 7): &#039;&#039;systemctl restart slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, open the firewall to allow communication between the master node and computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return the node to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
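A node listing in this format can also be summarised with a script. Below is a minimal Python sketch, assuming the column layout shown in the listing for this page; the function names are illustrative, and the sample text is copied from that listing rather than read from a live cluster.&lt;br /&gt;

```python
SAMPLE_SINFO = """Wed May 27 09:49:54 2020
NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none
n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none
n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none
n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none
"""


def parse_sinfo(text):
    """Turn 'sinfo -lNe'-style text into a list of dicts keyed by column name."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Skip the timestamp line by locating the header row.
    start = next(i for i, line in enumerate(lines) if line.startswith("NODELIST"))
    header = lines[start].split()
    rows = []
    for line in lines[start + 1:]:
        # REASON is the last column and may contain spaces, so cap the split.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


def nodes_in_state(rows, state):
    """Return the names of nodes whose STATE column equals the given state."""
    return [row["NODELIST"] for row in rows if row["STATE"] == state]


def idle_cpus(rows):
    """Sum the CPUS column over idle nodes."""
    return sum(int(row["CPUS"]) for row in rows if row["STATE"] == "idle")


rows = parse_sinfo(SAMPLE_SINFO)
print(nodes_in_state(rows, "drained"), idle_cpus(rows))
```

To use it on a live cluster, feed it the captured output of sinfo -lNe instead of SAMPLE_SINFO.&lt;br /&gt;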
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12625</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12625"/>
		<updated>2020-05-28T14:49:30Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
GPR40 example: /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
ChEMBL ligands: /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the packages built by rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID and GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* restart Slurm on the computing nodes (CentOS 7): &#039;&#039;systemctl restart slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Last but not least, open the firewall to allow communication between the master node and computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return the node to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12624</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12624"/>
		<updated>2020-05-28T14:48:53Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
After the installation is completed, you need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Running DOCK-3.7 with Slurm&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
GPR40 example: /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
ChEMBL ligands: /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources are offered by the cluster, do &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Install the packages built by rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a user slurm: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID and GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX (UID) and YYYYY (GID) for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel and put locally to /etc/slurm.&lt;br /&gt;
* figure out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* restart Slurm on the computing nodes (CentOS 7): &#039;&#039;systemctl restart slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
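Whether the slurm user has the same IDs on two machines can be checked by comparing their /etc/passwd entries. Below is a minimal Python sketch, assuming the standard colon-separated passwd layout (name:password:UID:GID:...); the function names and the numeric IDs in the samples are placeholders, not the values used on gimel.&lt;br /&gt;

```python
# Placeholder passwd excerpts; the IDs below are invented for illustration.
GIMEL_PASSWD = "root:x:0:0:root:/root:/bin/bash\nslurm:x:64030:64030::/nonexistent:/bin/false\n"
NODE_PASSWD = "root:x:0:0:root:/root:/bin/bash\nslurm:x:1001:1001::/home/slurm:/bin/bash\n"


def passwd_ids(passwd_text, user):
    """Return (uid, gid) for the user from passwd-style text, or None if absent."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if fields[0] != user:
            continue
        # Field 2 is the UID and field 3 is the GID in the passwd format.
        return int(fields[2]), int(fields[3])
    return None


def ids_match(master_text, node_text, user="slurm"):
    """True only if the user exists in both texts with identical UID and GID."""
    master_ids = passwd_ids(master_text, user)
    return master_ids is not None and master_ids == passwd_ids(node_text, user)


print(ids_match(GIMEL_PASSWD, NODE_PASSWD))
```

A False result means the slurm line on the node still needs to be edited with vipw as described in the list above.&lt;br /&gt;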
Last but not least, open the firewall to allow communication between the master node and computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, do &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return the node to service, do &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12623</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12623"/>
		<updated>2020-05-28T14:47:42Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install it with &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
You may need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Running DOCK-3.7 with Slurm&lt;br /&gt;
Here is a “guinea pig project”, which has been done with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
GPR40 example: /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
ChEMBL ligands: /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem with the setup.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful Slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources the cluster offers, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions (for sysadmins only)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
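A small sketch for checking the key permissions set above. check_key_mode is a hypothetical helper, and stat -c assumes GNU coreutils (as on CentOS). Once the daemon is running on a node, munge -n | unmunge is the usual local round-trip test.&lt;br /&gt;

```shell
#!/bin/sh
# Hypothetical helper: report whether a key file has the expected mode 400.
check_key_mode() {
    mode=$(stat -c '%a' "$1")
    if [ "$mode" = "400" ]; then
        echo "ok"
    else
        echo "unexpected mode: $mode"
    fi
}

# Usage on a node (path taken from the text above):
#   check_key_mode /etc/munge/munge.key
```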
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* restart Slurm on the computing nodes (CentOS 7): &#039;&#039;systemctl restart slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and the computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
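For the NodeName line in the setup steps above, the CPU count can be read with nproc instead of inspecting /proc/cpuinfo by hand. A minimal sketch, where node_line is a hypothetical helper; slurmd -C prints the configuration that Slurm itself detects on the node and is a useful cross-check.&lt;br /&gt;

```shell
#!/bin/sh
# Hypothetical helper: format a slurm.conf node entry from name, address and CPU count.
node_line() {
    printf 'NodeName=%s NodeAddr=%s CPUs=%s State=UNKNOWN\n' "$1" "$2" "$3"
}

# nproc prints the number of processing units available on this machine.
node_line n-1-17 10.20.1.17 "$(nproc)"
```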
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12622</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12622"/>
		<updated>2020-05-28T14:46:49Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Simple installation: &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
You may need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
Here is a “guinea pig project”, which has been run with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
GPR40 example: /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
ChEMBL ligands: /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&amp;lt;br&amp;gt;&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&amp;lt;br&amp;gt;&lt;br /&gt;
Do not forget to set the DOCKBASE variable: export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful Slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources the cluster offers, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* restart Slurm on the computing nodes (CentOS 7): &#039;&#039;systemctl restart slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and the computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12621</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12621"/>
		<updated>2020-05-28T14:46:10Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Simple installation: &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
You may need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
Here is a “guinea pig project”, which has been run with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
GPR40 example: /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
ChEMBL ligands: /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&lt;br /&gt;
&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&lt;br /&gt;
&lt;br /&gt;
The Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&lt;br /&gt;
Do not forget to set the DOCKBASE variable: export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful Slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources the cluster offers, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* restart Slurm on the computing nodes (CentOS 7): &#039;&#039;systemctl restart slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and the computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
	<entry>
		<id>http://wiki.docking.org/index.php?title=Slurm&amp;diff=12620</id>
		<title>Slurm</title>
		<link rel="alternate" type="text/html" href="http://wiki.docking.org/index.php?title=Slurm&amp;diff=12620"/>
		<updated>2020-05-28T14:45:08Z</updated>

		<summary type="html">&lt;p&gt;Dudenko: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Slurm userguide&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful libraries and utilities on master node (gimel)&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* ANACONDA Installation (Python 2.7)&lt;br /&gt;
&lt;br /&gt;
Each user is welcome to download Anaconda and install it into their own folder&amp;lt;br&amp;gt;&lt;br /&gt;
https://www.anaconda.com/distribution/&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;wget https://repo.anaconda.com/archive/Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
NB: Anaconda is also available for Python 3, which we expect to move to in the near future&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Simple installation: &#039;&#039;/bin/sh Anaconda2-2019.10-Linux-x86_64.sh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
You may need to install a few packages:&lt;br /&gt;
 conda install -c free bsddb&lt;br /&gt;
 conda install -c rdkit rdkit&lt;br /&gt;
 conda install numpy&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Running DOCK-3.7 with Slurm&#039;&#039;&#039;&lt;br /&gt;
Here is a “guinea pig project”, which has been run with DOCK-3.7 locally.&amp;lt;br&amp;gt;&lt;br /&gt;
GPR40 example: /mnt/nfs/home/dudenko/TEST_DOCKING_PROJECT&amp;lt;br&amp;gt;&lt;br /&gt;
ChEMBL ligands: /mnt/nfs/home/dudenko/CHEMBL4422_active_ligands&lt;br /&gt;
&lt;br /&gt;
This test calculation should run smoothly; if it does not, there is a problem.&lt;br /&gt;
&lt;br /&gt;
The Slurm queue is installed locally; use it to run this test (and all your future jobs) in parallel.&lt;br /&gt;
Do not forget to set the DOCKBASE variable: export DOCKBASE=/nfs/soft/dock/versions/dock37/DOCK-3.7.3rc1/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful commands to remember:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
 $DOCKBASE/docking/setup/setup_db2_zinc15_file_number.py ./ CHEMBL4422_active_ligands_ CHEMBL4422_active_ligands.sdi 100 count&lt;br /&gt;
 $DOCKBASE/analysis/extract_all.py -s -10&lt;br /&gt;
 $DOCKBASE/analysis/getposes.py -l 500 -o CHEMBL4422_active_ligands.mol2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Useful Slurm commands (see https://slurm.schedmd.com/quickstart.html):&#039;&#039;&#039;&lt;br /&gt;
 to see what machine resources the cluster offers, run &#039;&#039;sinfo -lNe&#039;&#039;&lt;br /&gt;
 to submit a DOCK-3.7 job, run &#039;&#039;$DOCKBASE/docking/submit/submit_slurm_array.csh&#039;&#039;&lt;br /&gt;
 to see what is happening in the queue, run &#039;&#039;squeue&#039;&#039;&lt;br /&gt;
 to see detailed info for a specific job, run &#039;&#039;scontrol show jobid=_JOBID_&#039;&#039;&lt;br /&gt;
 to delete a job from the queue, run &#039;&#039;scancel _JOBID_&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If Slurm is running correctly, type &#039;&#039;squeue&#039;&#039; and you should see something like this:&lt;br /&gt;
&lt;br /&gt;
 #### BASH command line output to console&lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
       217_[9-100] pdl-stati array_jo   docker PD       0:00      1 (Resources)&lt;br /&gt;
             217_8 pdl-stati array_jo   docker  R       0:00      1 pdl-station&lt;br /&gt;
             217_5 pdl-stati array_jo   docker  R       0:08      1 pdl-station&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As root at gimel, it is possible to modify a particular job, e.g., &#039;&#039;scontrol update jobid=635 TimeLimit=7-00:00:00&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Detailed step-by-step installation instructions&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Useful link: https://slurm.schedmd.com/quickstart_admin.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;node n-1-17&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* make sure the node runs CentOS 7: &#039;&#039;cat /etc/redhat-release&#039;&#039;&lt;br /&gt;
* &#039;&#039;wget https://download.schedmd.com/slurm/slurm-17.02.11.tar.bz2&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install readline-devel perl-ExtUtils-MakeMaker.noarch munge-devel pam-devel&#039;&#039;&lt;br /&gt;
* &#039;&#039;export VER=17.02.11; rpmbuild -ta slurm-$VER.tar.bz2 --without mysql; mv /root/rpmbuild .&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
installing built packages from rpmbuild:&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-plugins-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
* &#039;&#039;yum install rpmbuild/RPMS/x86_64/slurm-munge-17.02.11-1.el7.x86_64.rpm&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up munge&#039;&#039;&#039;:&lt;br /&gt;
Copy /etc/munge/munge.key from gimel into the local /etc/munge. The key must be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Munge is a daemon responsible for secure data exchange between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
Set permissions accordingly: &#039;&#039;chown munge:munge /etc/munge/munge.key; chmod 400 /etc/munge/munge.key&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;starting munge&#039;&#039;&#039;: &#039;&#039;systemctl enable munge; systemctl start munge&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;setting up slurm&#039;&#039;&#039;:&lt;br /&gt;
* create a slurm user: &#039;&#039;adduser slurm&#039;&#039;&lt;br /&gt;
* the UID and GID of the slurm user should be identical on all nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  Otherwise, one needs to specify a mapping scheme for translating the UID/GID between nodes.&amp;lt;br&amp;gt;&lt;br /&gt;
  To edit the slurm UID/GID, run &#039;&#039;vipw&#039;&#039; and replace the slurm line with slurm:x:XXXXX:YYYYY::/nonexistent:/bin/false&amp;lt;br&amp;gt;&lt;br /&gt;
  XXXXX and YYYYY for the slurm user can be found on gimel in /etc/passwd&amp;lt;br&amp;gt;&lt;br /&gt;
  NB: don&#039;t forget to edit /etc/group as well.&amp;lt;br&amp;gt;&lt;br /&gt;
* copy /etc/slurm/slurm.conf from gimel into the local /etc/slurm.&lt;br /&gt;
* find out what CPU/memory resources n-1-17 has (see /proc/cpuinfo) and append the following line to slurm.conf:&lt;br /&gt;
  NodeName=n-1-17 NodeAddr=10.20.1.17 CPUs=24 State=UNKNOWN&lt;br /&gt;
* append n-1-17 to the partition: PartitionName=gimel Nodes=gimel,n-5-34,n-5-35,n-1-17 Default=YES MaxTime=INFINITE State=UP&lt;br /&gt;
* create the following folders:&lt;br /&gt;
  &#039;&#039;mkdir -p /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
  &#039;&#039;chown -R slurm:slurm /var/spool/slurm-llnl /var/run/slurm-llnl /var/log/slurm-llnl&#039;&#039;&lt;br /&gt;
* restart Slurm on the master node gimel (CentOS 6): &#039;&#039;/etc/init.d/slurm restart&#039;&#039;&lt;br /&gt;
* restart Slurm on the computing nodes (CentOS 7): &#039;&#039;systemctl restart slurmd&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Finally, configure the firewall to allow communication between the master node and the computing node n-1-17:&lt;br /&gt;
* &#039;&#039;firewall-cmd --permanent --zone=public --add-port=6818/tcp&#039;&#039;&lt;br /&gt;
* &#039;&#039;firewall-cmd --reload&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To disable a specific node, run &#039;&#039;scontrol update NodeName=n-1-17 State=DRAIN Reason=DRAINED&#039;&#039;&lt;br /&gt;
To return it to service, run &#039;&#039;scontrol update NodeName=n-1-17 State=IDLE&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
To see the current state of the nodes, run &#039;&#039;sinfo -lNe&#039;&#039;; the output looks like this:&lt;br /&gt;
 Wed May 27 09:49:54 2020&lt;br /&gt;
 NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              &lt;br /&gt;
 gimel          1    gimel*     drained   24    4:6:1      1        0      1   (null) none                &lt;br /&gt;
 n-1-17         1    gimel*        idle   24   24:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-34         1    gimel*        idle   80   80:1:1      1        0      1   (null) none                &lt;br /&gt;
 n-5-35         1    gimel*        idle   80   80:1:1      1        0      1   (null) none&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
p.s. Some users/scripts may require csh/tcsh.&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;sudo yum install csh tcsh&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[DOCK_3.7]]&lt;/div&gt;</summary>
		<author><name>Dudenko</name></author>
	</entry>
</feed>