Schrodinger
SCHRODINGER - getting it running
Get a License File:
You will receive an email saying your Schrodinger license keys are ready for retrieval.
In that email, click the link that follows "please use this form to generate the license file:"
Cluster 0
In the License Retrieval Assistant, make sure you have the following information for the respective categories:
Host ID: 0015605f526c
Machine Name: nis.compbio.ucsf.edu
FLEXlm Server Port: 27000
Cluster 2
Host ID: this_host
Machine Name: bet
FLEXlm Server Port: 27008
Debugging:
On Cluster 0, all Schrodinger files are located locally on nfshead2:/raid3, but the commands below should be executed on nis as user tdemers.
Make sure that the variable $LM_LICENSE_FILE is set to port@server, where the server name exactly matches the one in the license file. The license.dat file must contain:
SERVER nis.compbio.ucsf.edu 0015605f526c 27000
VENDOR SCHROD PORT=53000
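The client-side variable can be set and checked like this (the port and hostname come from the SERVER line above):

```shell
# Client-side setting; port and hostname must match the SERVER line in license.dat.
export LM_LICENSE_FILE=27000@nis.compbio.ucsf.edu
echo "$LM_LICENSE_FILE"
```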
Make sure the port is open in iptables
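A quick way to check and open the ports is something like the following (a sketch; assumes root and the classic iptables service, with the port numbers taken from the SERVER/VENDOR lines above):

```shell
# List INPUT rules and look for the license-server ports
# (27000 for lmgrd, 53000 for the SCHROD vendor daemon)
iptables -L INPUT -n | grep -E '27000|53000'

# If they are missing, open both ports (sketch; rule position and policy
# may differ on your host) and persist the change
iptables -I INPUT -p tcp --dport 27000 -j ACCEPT
iptables -I INPUT -p tcp --dport 53000 -j ACCEPT
service iptables save
```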
source /raid3/software/schrodinger/current.sh
Try some combination of the following:
$SCHRODINGER/licadmin STAT -c $SCHRODINGER/license.dat
$SCHRODINGER/licadmin REREAD -l $SCHRODINGER/lmgrd.log -c $SCHRODINGER/license.dat
$SCHRODINGER/licadmin SERVERDOWN
$SCHRODINGER/licadmin SERVERUP -l $SCHRODINGER/lmgrd.log -c $SCHRODINGER/license.dat
Installing Schrodinger on Cluster 0
First, go to the website and download the software. You should end up with two files: Schrodinger Workflow … .zip and Schrodinger Suites … .tar. scp both of these files to the server, into the schrodinger directory. On the server, in the schrodinger directory, mkdir MonthYear and cd into that directory. Untar the tar file and run the INSTALL script. At the end you'll see something like this:
*) Licensing You will need one or more licenses before you can run the software you have just installed.
Please note the following information, which you will need in order to generate a license key:
Host ID: 001e0bd543b8
Machine name: nfshead2.bkslab.org
If you are not performing this installation on your license server, you will need the output of:
$SCHRODINGER/machid -hostid
Installing Schrodinger 2019 on Cluster 2
Install
https://www.schrodinger.com/downloads/releases
Select the Linux 64-bit version. Download it to your local computer first, then scp the tarball over to nfs-soft, into the appropriate directory. Extract the tarball and you'll get a bunch of smaller tar files.
# ls
Schrodinger_Suites_2019-1_Linux-x86_64.tar
# tar -xvf Schrodinger_Suites_2019-1_Linux-x86_64.tar
Schrodinger_Suites_2019-1_Linux-x86_64/canvas-v3.9-Linux-x86_64.tar.gz
Schrodinger_Suites_2019-1_Linux-x86_64/mcpro-v5.3-Linux-x86_64.tar.gz
Schrodinger_Suites_2019-1_Linux-x86_64/desmond-v5.7-Linux-x86_64.tar.gz
Schrodinger_Suites_2019-1_Linux-x86_64/INSTALL
. . .
Schrodinger_Suites_2019-1_Linux-x86_64/CHECKSUM.md5
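Since the bundle ships a CHECKSUM.md5, it's worth verifying the extracted files before installing (a sketch; run it from inside the extracted directory):

```shell
# Verify every file listed in the manifest against its MD5 checksum;
# md5sum -c prints FAILED for any corrupted download
cd Schrodinger_Suites_2019-1_Linux-x86_64
md5sum -c CHECKSUM.md5
```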
https://www.schrodinger.com/license-installation-instructions
We do not need to untar these individually. The INSTALL script takes care of nearly everything. All we have to do is set the path where we want the installed programs to go.
[root@bet ~]# export SCHRODINGER=/export/soft/schrodinger/2019-1/
[root@bet ~]# ./INSTALL
The INSTALL script will ask where you're running your license server. We run the license server on the same machine as the installation, so tell the software that it will run at 27008@bet.
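Once the license server is up, you can sanity-check that something is actually listening on the chosen port (a sketch; on older systems `ss` may need to be replaced with `netstat -tlnp`):

```shell
# Look for the lmgrd daemon listening on the port we told INSTALL about
ss -tlnp | grep 27008
```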
Set Environment Files
Notice we set SCHROD_LICENSE_FILE to '27008@bet' rather than using the FQDN. This is because the desktops are on the public network (compbio.ucsf.edu) while the cluster is on a private network (cluster.ucsf.bkslab.org). With an FQDN, the desktops may resolve the domain but the cluster won't, or vice versa. Therefore, we reference the license server as simply 'bet'.
env.sh
#!/bin/bash
export SCHRODINGER="/nfs/soft/schrodinger/2019-1"
export SCHRODINGER_THIRDPARTY="$SCHRODINGER/thirdparty"
export SCHRODINGER_PDB="$SCHRODINGER_THIRDPARTY/database/pdb"
export SCHRODINGER_UTILITIES="$SCHRODINGER/utilities"
export SCHRODINGER_RCP="scp"
export SCHRODINGER_RSH="ssh"
export PSP_BLASTDB="$SCHRODINGER_THIRDPARTY/database/blast/"
export PSP_BLAST_DATA="$SCHRODINGER_THIRDPARTY/bin/Linux-x86/blast/data/"
export PSP_BLAST_DIR="$SCHRODINGER_THIRDPARTY/bin/Linux-x86/blast/"
export SCHROD_LICENSE_FILE="27008@bet"
export LM_LICENSE_FILE="27008@bet"
export PATH="${SCHRODINGER}:${SCHRODINGER_UTILITIES}:${PATH}:${SCHRODINGER_THIRDPARTY}/desmond_to_trj"
env.csh
#!/bin/csh
setenv SCHRODINGER "/mnt/nfs/soft/schrodinger/2019-1"
setenv SCHRODINGER_THIRDPARTY "$SCHRODINGER/thirdparty"
setenv SCHRODINGER_PDB "$SCHRODINGER_THIRDPARTY/database/pdb"
setenv SCHRODINGER_UTILITIES "$SCHRODINGER/utilities"
setenv SCHRODINGER_RCP "scp"
setenv SCHRODINGER_RSH "ssh"
setenv PSP_BLASTDB "$SCHRODINGER_THIRDPARTY/database/blast/"
setenv PSP_BLAST_DATA "$SCHRODINGER_THIRDPARTY/bin/Linux-x86/blast/data/"
setenv PSP_BLAST_DIR "$SCHRODINGER_THIRDPARTY/bin/Linux-x86/blast/"
setenv SCHROD_LICENSE_FILE "27008@bet"
setenv PATH "${SCHRODINGER}:${SCHRODINGER_UTILITIES}:${PATH}:${SCHRODINGER_THIRDPARTY}/desmond_to_trj"
Licensing
Edit the license-file line that contains 'SERVER'. For the server name, put 'this_host' instead of the hostname. This way the license server will be recognized under any of its DNS hostnames, regardless of domain.
SERVER this_host 80c16e65897d 27008
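After editing the SERVER line, the license daemon has to re-read the file; the licadmin commands from the debugging section above can be reused for that, e.g.:

```shell
# Ask the running daemon to re-read the edited license file
$SCHRODINGER/licadmin REREAD -c $SCHRODINGER/license.dat

# Or bounce the daemon entirely and confirm it comes back up
$SCHRODINGER/licadmin SERVERDOWN
$SCHRODINGER/licadmin SERVERUP -l $SCHRODINGER/lmgrd.log -c $SCHRODINGER/license.dat
$SCHRODINGER/licadmin STAT -c $SCHRODINGER/license.dat
```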
Schrodinger Hosts & Queue Config Files
The schrodinger.hosts file lives in the current Schrodinger installation directory and contains the list of hosts and queues available to Schrodinger. The first host entry should just be a localhost entry, to allow users to run Schrodinger on their local machine. Other host entries contain information such as which queue to use, how many processors are available, which GPUs exist, whether parallelization is enabled, etc.
schrodinger.hosts file
name: gimel-sge
host: gimel
queue: SGE
qargs: -q gpu.q -pe local %NPROC% -l gpu=1
tmpdir: /scratch
processors: 32
gpgpu: 0, nvidia
gpgpu: 1, nvidia
gpgpu: 2, nvidia
gpgpu: 3, nvidia
parallel: 1

name: gimel2-sge
host: gimel2
queue: SGE
qargs: -q gpu.q -pe local %NPROC% -l gpu=1
tmpdir: /scratch
processors: 32
gpgpu: 0, nvidia
gpgpu: 1, nvidia
gpgpu: 2, nvidia
gpgpu: 3, nvidia
parallel: 1

name: gimel2-n923q
host: gimel2
queue: SGE
qargs: -q n-9-23.q -pe local %NPROC%
tmpdir: /scratch
processors: 80
parallel: 1
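Jobs are then pointed at one of these entries with the -HOST flag, for example with LigPrep (a sketch; the input/output filenames here are hypothetical):

```shell
# Run LigPrep against the gimel-sge host entry with 4 parallel subjobs.
# in.smi / out.maegz are hypothetical filenames for illustration only.
$SCHRODINGER/ligprep -ismi in.smi -omae out.maegz -HOST gimel-sge -NJOBS 4
```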
Since we use Open Grid Engine, we must configure the queue config file for SGE, located at $SCHRODINGER/queues/SGE/config.
QPATH=/usr/bin/
QPROFILE=/nfs/ge/ucsf.bks/cell/common/settings.sh
QSUB=qsub
QDEL=qdel
QSTAT=qstat
LICENSE_CHECKING=yes
Troubleshooting: D-Bus Errors
We had a period where our jobs were dying upon submission. We would get this strange error message:
process 23478: arguments to dbus_move_error() were incorrect, assertion "(dest) == NULL || !dbus_error_is_set ((dest))" failed in file dbus-errors.c line 278.
This is normally a bug in some application using the D-Bus library.
  D-Bus not built with -rdynamic so unable to print a backtrace
Fatal Python error: Aborted
It turns out this was due to SELinux being enabled. As a temporary workaround, I have disabled SELinux on the hosts that were experiencing this issue. We'll need to dig deeper into /var/log/audit/audit.log to diagnose what was wrong.
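For reference, the workaround and a starting point for the diagnosis look something like this (assumes root on the affected host and the audit tools installed):

```shell
# Put SELinux in permissive mode for now (does not survive a reboot;
# edit /etc/selinux/config to make it permanent)
setenforce 0
getenforce

# Later: search the audit log for recent AVC denials that
# correlate with the failed job submissions
ausearch -m avc -ts recent
```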
Troubleshooting: All processes go onto the same GPU
When we submit GPU jobs via Maestro/Desmond, we can choose the number of GPUs used in the run. However, when we first requested four GPUs for a run, Schrodinger allocated all four processes onto the same GPU. To address this, log into the GPU nodes and set the GPUs to exclusive mode, so that no more than one process runs on a GPU at a time.
$ nvidia-smi -c 3
Found on this webpage: https://www.schrodinger.com/kb/1834
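Exclusive-process mode has to be set on each GPU node, and typically needs to be reapplied after a reboot since the setting does not persist. A sketch, with hypothetical node names:

```shell
# n-1-17 / n-1-18 are hypothetical GPU node names; substitute your own.
for node in n-1-17 n-1-18; do
    ssh "$node" 'nvidia-smi -c 3'   # 3 = EXCLUSIVE_PROCESS
done

# Verify the mode took effect on one node
ssh n-1-17 'nvidia-smi -q -d COMPUTE | grep -i "compute mode"'
```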
Troubleshooting: Multi-process jobs only finish a single process
Ligprep jobs get sent to a node to begin. We've been sending ligprep jobs that utilize six additional parallel processes, which are spawned as six sub-jobs. Unfortunately, when we first tried this, only the head process would start and none of the sub-jobs would get submitted. This happens because of the way Schrodinger spawns additional subprocesses: the head job runs on a compute node and then tries to contact an SGE submit host (gimel, gimel2) via SSH. If you do not have passwordless SSH enabled, the job fails to spawn sub-jobs. What you need to do is create an ssh key in your home directory to be used solely for SSH connections from a compute node to gimel/gimel2. Since your home directory is NFS-mounted across all nodes on the cluster, you only need to create the key once and append the public key to your authorized_keys file under .ssh.
$ ssh-keygen    (follow the prompts and don't set a passphrase)
$ cat ~/.ssh/<new key>.pub >> ~/.ssh/authorized_keys
$ vi ~/.ssh/config

Host gimel gimel2
    IdentityFile ~/.ssh/compute_to_gimel
This way, the process on the compute node can successfully contact the SGE submission hosts and spawn additional subprocesses.
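You can verify the setup from a compute node before resubmitting; BatchMode makes ssh fail immediately instead of prompting for a password:

```shell
# Confirm passwordless SSH works to both submit hosts
for h in gimel gimel2; do
    ssh -o BatchMode=yes "$h" true && echo "$h: passwordless OK" || echo "$h: FAILED"
done
```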