How to do parallel search of smi files on the cluster

From DISI

Revision as of 17:44, 19 July 2018

This tutorial shows how to search smi files in parallel on the cluster. The files and scripts can be found in /nfs/home/jizhou/ex7/2D/test-parallel on gimel.compbio.ucsf.edu.

Create a folder with the following files and scripts:

SUBMIT.sh
input.txt
search_smi.sh
merge.sh

SUBMIT.sh

SUBMIT.sh contains the bash code that calls qsub. It specifies the qsub command, the parameters for qsub, the input file, the function script, and the parameters for the function. An example is shown below.

#!/bin/bash

# Arguments, in order (a comment after a trailing "\" would break the line continuation):
#   qsub-mr           the qsub command
#   -l 5              the number of lines to be handled by each task, here 5
#   -N test           the job name (listed as test-map in the qstat output below)
#   input.txt         the file listing the input file names and directories
#   ./search_smi.sh   the searching function to be performed
#   -q "..."          parameter for search_smi.sh, the input query for searching
/nfs/soft/tools/utils/qsub-slice/qsub-mr \
    -l 5 \
    -N test \
    input.txt \
    ./search_smi.sh \
    -q "CS(=O)(=O)CCNCc1ccccc1"
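To get a feel for how many tasks a submission will create, the -l value can be checked against the length of input.txt. The sketch below assumes that qsub-mr cuts the input list into consecutive slices of at most -l lines and starts one task per slice; this follows from the description of -l above but has not been verified against qsub-mr itself. The function name tasks_needed is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: estimate the number of tasks for a given list file,
# assuming one task per slice of at most per_task lines (ceiling division).
tasks_needed() {
    local list="$1" per_task="$2" lines
    lines=$(wc -l < "$list")                      # number of lines in the list
    echo $(( (lines + per_task - 1) / per_task )) # round up to whole tasks
}
```

For example, tasks_needed input.txt 5 prints 5 for a list of 23 files, since the last task receives the remaining 3 lines.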


input.txt

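input.txt lists the smi files to be searched, one path per line. An example of input.txt is shown below. You can use ls *.smi > input.txt to generate this file; give ls the full directory (for example ls /nfs/home/jizhou/ex7/2D/CD/*.smi > input.txt) to get absolute paths as in the example.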

/nfs/home/jizhou/ex7/2D/CD/CDAA.smi
/nfs/home/jizhou/ex7/2D/CD/CDAB.smi
/nfs/home/jizhou/ex7/2D/CD/CDAC.smi
/nfs/home/jizhou/ex7/2D/CD/CDAD.smi
/nfs/home/jizhou/ex7/2D/CD/CDAE.smi
/nfs/home/jizhou/ex7/2D/CD/CDAF.smi
...
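A list like the one above can also be produced with a small helper that always writes absolute paths, whatever directory it is called from. This is only a sketch: the function name make_input_list is hypothetical, and it assumes a find that supports -maxdepth (GNU and BSD find both do).

```shell
#!/bin/bash
# Hypothetical helper: write the absolute path of every .smi file found
# directly inside a directory to a list file, one per line, sorted by name.
make_input_list() {
    local dir="$1" out="$2" abs
    abs="$(cd "$dir" && pwd)" || return 1   # resolve dir to an absolute path
    find "$abs" -maxdepth 1 -type f -name '*.smi' | sort > "$out"
}
```

Calling make_input_list /nfs/home/jizhou/ex7/2D/CD input.txt would write the list to input.txt in the current folder.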


search_smi.sh

search_smi.sh is the searching function used by qsub. Its core is mol2img_trial, which is located in "/nfs/home/jizhou/work/Projects/smi_index/dotmatics/". mol2img_trial generates an index for each smi file to speed up searching. search_smi.sh requires an input query for searching, passed with -q. An example is shown below.

-q "CS(=O)(=O)CCNCc1ccccc1"
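For readers writing their own function script, the sketch below shows one common way a bash script can read a -q option with getopts. It is not the actual code of search_smi.sh, which is not reproduced here; the function name parse_query and the usage message are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: extract the value of -q from the given arguments
# and print it; return status 2 if -q is missing or malformed.
parse_query() {
    local OPTIND=1 opt query=""
    while getopts ":q:" opt; do
        case "$opt" in
            q) query="$OPTARG" ;;                                    # the query string
            :) echo "option -$OPTARG needs a value" >&2; return 2 ;;  # -q given without a value
            *) echo "unknown option -$OPTARG" >&2; return 2 ;;        # any other option
        esac
    done
    if [ -z "$query" ]; then
        echo "usage: -q QUERY" >&2
        return 2
    fi
    printf '%s\n' "$query"
}
```

Inside a script this would be called as query="$(parse_query "$@")". The query should stay in quotes on the command line, as in the example above, because characters such as ( and = are otherwise interpreted by the shell.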


Run SUBMIT.sh

Run SUBMIT.sh to submit the job to the cluster. The job runs in the background. When it finishes, a new directory named outputs is created in the current folder, and the results are stored in outputs/. You can use the following commands to check the qsub status, or to start or stop a job.

qstat                         # check the status of jobs, example is shown below.

-bash-4.1$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
6511305 1.25000 test-map   jizhou       r     07/19/2018 10:42:43 all.q@n-5-29.cluster.ucsf.bksl     1 1
6511305 0.75000 test-map   jizhou       r     07/19/2018 10:42:43 all.q@n-9-20.cluster.ucsf.bksl     1 2
6511305 0.58333 test-map   jizhou       r     07/19/2018 10:42:43 all.q@n-1-132.cluster.ucsf.bks     1 3
6511305 0.50000 test-map   jizhou       r     07/19/2018 10:42:43 all.q@n-9-21.cluster.ucsf.bksl     1 4