Substructure searching

Latest revision as of 19:14, 13 September 2017

Written by Jiankun Lyu, 2017/09/13

The hierarchy of the directories:

substructure_searching----- working 
              |                |
              |                |------ ZINC-downloader-2D-smi.database_index
              |                | 
              |                |------ sub_pattern.smarts
              |
              ------- scripts ------ submit.csh
                              |
                              |------ setup_substructure_searching_files.py

1) Create the directories shown above.

mkdir substructure_searching
cd substructure_searching
mkdir working
mkdir scripts

2) Download the database index from ZINC

2.1) Go to ZINC http://zinc15.docking.org/tranches/home/#

2.2) Choose the tranches you want to search.

2.3) Download the database index file.

2.4) Save the downloaded file as ZINC-downloader-2D-smi.database_index, then upload it to the working directory.

3) Copy the scripts from my directory.

cd scripts
cp /mnt/nfs/home/jklyu/zzz.script/analogs_searching/setup_substructure_searching_files.py .
cp /mnt/nfs/home/jklyu/zzz.script/analogs_searching/multi_sub_searching/submit.csh .
cd ../

4) Put the SMARTS patterns you want to search for in the sub_pattern.smarts file, one per line, and give each pattern a unique number or name.

Here is an example of a sub_pattern.smarts file:
NS(=O)(=O)c1cccc([F,Cl,Br,I])c1[OD1] 1
NS(=O)(=O)c1cc([F,Cl,Br,I])ccc1[OD1] 2
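
Before submitting jobs, it can help to check that every line of the pattern file has a pattern plus a unique label. The sketch below is not one of the scripts above. It assumes each line holds exactly two whitespace-separated fields (SMARTS, then label), as in the example, and the function name read_patterns is hypothetical.

```python
def read_patterns(lines):
    """Parse 'SMARTS label' lines into (smarts, label) pairs.

    Assumes two whitespace-separated fields per line (an assumption
    based on the example file, not on the search scripts themselves).
    """
    patterns = []
    seen = set()
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError("line %d: expected 'SMARTS label', got %r" % (lineno, line))
        smarts, label = parts
        if label in seen:
            raise ValueError("line %d: duplicate label %r" % (lineno, label))
        seen.add(label)
        patterns.append((smarts, label))
    return patterns


example = [
    "NS(=O)(=O)c1cccc([F,Cl,Br,I])c1[OD1] 1",
    "NS(=O)(=O)c1cc([F,Cl,Br,I])ccc1[OD1] 2",
]
patterns = read_patterns(example)
```

To check a real file, pass an open file handle instead of the list, for example read_patterns(open("sub_pattern.smarts")). This only checks the layout of the file; it does not verify that each SMARTS string is chemically valid.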

5) Split the ZINC-downloader-2D-smi.database_index file into chunks

cd working
python ../scripts/setup_substructure_searching_files.py . sub_searching_ ZINC-downloader-2D-smi.database_index number_of_chunks count

Replace number_of_chunks with an actual integer. I suggest about 3 SMILES files per chunk, so choose the value based on how many files your selected tranches contain.
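
One way to pick the number of chunks is to count the entries in the index file and divide by 3, rounding up. The sketch below assumes the index file lists one SMILES file per non-empty line; that is an assumption about the file layout, and count_chunks is a hypothetical helper, not part of the scripts above.

```python
def count_chunks(index_lines, files_per_chunk=3):
    """Return the number of chunks needed for the given index entries.

    Assumes one SMILES file per non-empty line of the index
    (an assumption about the ZINC index layout).
    """
    n_files = sum(1 for line in index_lines if line.strip())
    # Integer ceiling division, so a partial last chunk still counts.
    return (n_files + files_per_chunk - 1) // files_per_chunk


# Placeholder entries standing in for the lines of a real index file.
entries = ["entry_%d" % i for i in range(10)]
n_chunks = count_chunks(entries)
```

With a real index, pass the open file, for example count_chunks(open("ZINC-downloader-2D-smi.database_index")), and use the result as the number_of_chunks argument.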

6) Submit substructure searching jobs

csh ../scripts/submit.csh full_path_of_sub_pattern.smarts

7) Collect results

cat sub_searching_*/*.extract.output.smi > output.smi
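
The same collection step can be sketched in Python, which also makes it easy to see how many chunk outputs were found and compare that with the number of chunks submitted. The file pattern is taken from the cat command above; the function name collect_results is hypothetical.

```python
import glob
import os
import shutil


def collect_results(working_dir, out_name="output.smi"):
    """Concatenate sub_searching_*/*.extract.output.smi into one file.

    Returns the sorted list of input files that were merged, so the
    caller can spot chunks that produced no output.
    """
    pattern = os.path.join(working_dir, "sub_searching_*", "*.extract.output.smi")
    paths = sorted(glob.glob(pattern))
    out_path = os.path.join(working_dir, out_name)
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # append the file unchanged
    return paths
```

Calling collect_results(".") from the working directory writes output.smi there, as the cat command does. If the returned list is shorter than expected, some jobs may still be running or may have failed.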