How to do parallel search of smi files on the cluster
Revision as of 17:55, 19 July 2018
This tutorial shows how to search smi files in parallel on the cluster. The files and scripts can be found in /nfs/home/jizhou/ex7/2D/test-parallel on gimel.compbio.ucsf.edu.
Create a folder with the following files and scripts:
SUBMIT.sh input.txt search_smi.sh merge.sh
SUBMIT.sh
SUBMIT.sh contains the bash code that submits the job with qsub. It specifies the qsub command, the parameters for qsub, the input file, the function script, and the parameters for the function. An example is shown below.
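 #!/bin/bash
 # Arguments, in order:
 #   .../qsub-mr        The qsub command
 #   -l 5               The number of lines to be handled by each task, here 5
 #   -N test            The name of the job (it appears as test-map in the qstat listing further down)
 #   input.txt          The input file names and directory
 #   ./search_smi.sh    The searching function to be performed
 #   -q "..."           Parameter for search_smi.sh, the input query for searching
 # The comments are kept out of the command itself because a comment after a
 # trailing backslash would break the line continuation.
 .../qsub-mr \
     -l 5 \
     -N test \
     input.txt \
     ./search_smi.sh \
     -q "CS(=O)(=O)CCNCc1ccccc1"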
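To get a feel for what -l 5 means for a given input list, the sketch below estimates how many tasks a job would have. It assumes that qsub-mr cuts input.txt into consecutive chunks of at most that many lines, which is inferred from the comment on -l and not confirmed against qsub-mr itself. count_tasks is a hypothetical helper, not part of qsub-mr.

```shell
#!/bin/bash
# Hypothetical helper: estimate the number of tasks for an input list.
# Assumption: the input file is split into chunks of at most
# LINES_PER_TASK lines and each chunk becomes one task.
count_tasks() {
    local input_file=$1 lines_per_task=$2
    local total
    # grep -c '' counts every line, including a last line without a newline.
    total=$(grep -c '' "$input_file")
    # Ceiling division: a final partial chunk still needs its own task.
    echo $(( (total + lines_per_task - 1) / lines_per_task ))
}
```

For example, count_tasks input.txt 5 on a list of 18 files would print 4 under this assumption.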
input.txt
input.txt lists the input files together with their directory, one file per line. An example of input.txt is shown below. You can use ls *.smi > input.txt to generate this file.
 .../ex7/2D/CD/CDAA.smi
 .../ex7/2D/CD/CDAB.smi
 .../ex7/2D/CD/CDAC.smi
 .../ex7/2D/CD/CDAD.smi
 .../ex7/2D/CD/CDAE.smi
 .../ex7/2D/CD/CDAF.smi
 ...
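Note that ls *.smi writes bare file names, while the example list carries a directory on every line. If the tasks need full paths, something like the following sketch may be useful. It assumes that all smi files sit directly in one directory; make_input_list is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: write one absolute .smi path per line to OUT_FILE.
make_input_list() {
    local smi_dir=$1 out_file=$2
    local abs_dir f
    # Resolve the directory so every line is valid from any working directory.
    abs_dir=$(cd "$smi_dir" && pwd) || return 1
    # The glob expands in sorted order; skip the literal pattern if nothing matches.
    for f in "$abs_dir"/*.smi; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done > "$out_file"
}
```

Usage: make_input_list .../ex7/2D/CD input.txt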
search_smi.sh
search_smi.sh is the searching function run by qsub. Its core is mol2img_trial, which is located in "/nfs/home/jizhou/work/Projects/smi_index/dotmatics/". mol2img_trial generates an index for the smi file to speed up searching. search_smi.sh requires an input query for searching. An example is shown below.
-q "CS(=O)(=O)CCNCc1ccccc1"
Run SUBMIT.sh
Run SUBMIT.sh to submit the job to the cluster. The job runs in the background. When it finishes, a new directory named outputs is created in the current folder, and the results are stored in outputs/. You can use the following command to check the status of the job. For more information, please refer to the qstat documentation.
 qstat    # check the status of jobs, example is shown below.

 -bash-4.1$ qstat
 job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
 -----------------------------------------------------------------------------------------------------------------
 6511305 1.25000 test-map   jizhou       r     07/19/2018 10:42:43 all.q@n-5-29.cluster.ucsf.bksl     1 1
 6511305 0.75000 test-map   jizhou       r     07/19/2018 10:42:43 all.q@n-9-20.cluster.ucsf.bksl     1 2
 6511305 0.58333 test-map   jizhou       r     07/19/2018 10:42:43 all.q@n-1-132.cluster.ucsf.bks     1 3
 6511305 0.50000 test-map   jizhou       r     07/19/2018 10:42:43 all.q@n-9-21.cluster.ucsf.bksl     1 4
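To tell when all tasks have left the queue, the qstat listing can be counted instead of read by eye. The sketch below reads qstat output from standard input and counts the rows whose name column matches a given job name. It assumes the column layout of the listing above (two header lines, name in the third column); count_active_tasks is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: count qstat rows (read from stdin) for one job name.
# Assumes two header lines and the job name in column 3, as in the listing above.
count_active_tasks() {
    local job_name=$1
    awk -v name="$job_name" 'NR > 2 && $3 == name { n++ } END { print n + 0 }'
}
```

Usage: qstat | count_active_tasks test-map. A result of 0 means qstat no longer lists any task under that name.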
merge.sh
When all jobs have completed, run merge.sh to check the outputs. Sample output is shown below.
 CS(=O)(=O)CCNCc1ccncc1 ZINC000037491283|70.6
 CS(=O)(=O)CCNCc1ccc(O)cc1 ZINC000037740328|70.6
 CS(=O)(=O)CCNCCOc1ccccc1 ZINC000048777006|70.6
 CS(=O)(=O)CCNCc1ccccc1 ZINC000037491280|100.0
 CS(=O)(=O)CCNCCc1ccccc1 ZINC000037491281|75.0
 ...
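The merged hits are not ordered, so ranking them can make the result easier to read. The sketch below assumes each line has the form SMILES, a space, then ID|score, and that the number after | is a score where higher is better (the query itself scores 100.0 in the sample). Both function names are hypothetical and are not part of merge.sh.

```shell
#!/bin/bash
# Hypothetical helpers for lines of the form "SMILES ID|score" read from stdin.

# Sort hits by the score after the "|" separator, highest first.
sort_by_score() {
    sort -t'|' -k2,2nr
}

# Keep only hits whose score is at least MIN_SCORE.
filter_min_score() {
    local min_score=$1
    awk -F'|' -v min="$min_score" '$2 + 0 >= min + 0'
}
```

Usage: ./merge.sh | filter_min_score 75 | sort_by_score, assuming merge.sh prints the hits to standard output.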