Substructure searching
Latest revision as of 19:14, 13 September 2017
Written by Jiankun Lyu, 2017/09/13
The hierarchy of the directories:
substructure_searching----- working 
              |                |
              |                |------ ZINC-downloader-2D-smi.database_index
              |                | 
              |                |------ sub_pattern.smarts
              |                                                 
              |                                                 
              |
              ------- scripts ------ submit.csh
                              |
                              |------ setup_substructure_searching_files.py
1) Create the directories shown above.
  mkdir substructure_searching
  cd substructure_searching
  mkdir working
  mkdir scripts
2) Download the database index from ZINC
2.1) Go to ZINC: http://zinc15.docking.org/tranches/home/#
2.2) Choose the tranches you want to search
2.3) Download the database index file for the selected tranches
2.4) Save the downloaded file as ZINC-downloader-2D-smi.database_index, then upload it to the working directory
3) Copy the scripts from my path.
  cd scripts
  cp /mnt/nfs/home/jklyu/zzz.script/analogs_searching/setup_substructure_searching_files.py .
  cp /mnt/nfs/home/jklyu/zzz.script/analogs_searching/multi_sub_searching/submit.csh .
  cd ../
4) Put the SMARTS patterns you want to search for in the sub_pattern.smarts file, one per line, and give each pattern a unique number or name
Here is an example of the sub_pattern.smarts file:
  NS(=O)(=O)c1cccc([F,Cl,Br,I])c1[OD1] 1
  NS(=O)(=O)c1cc([F,Cl,Br,I])ccc1[OD1] 2
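Because each pattern needs a unique number or name, it can help to check the file before submitting jobs. The following is only a sketch: the function name read_patterns is hypothetical, and it assumes each non-blank line holds a SMARTS string followed by whitespace and a label, as in the example above. It does not check that the SMARTS itself is chemically valid.

```python
from pathlib import Path


def read_patterns(path):
    """Return a list of (smarts, label) pairs from a pattern file.

    Assumed layout (taken from the example above): one pattern per line,
    the SMARTS string, whitespace, then a unique number or name.
    """
    pairs = []
    seen = set()
    for line_number, line in enumerate(Path(path).read_text().splitlines(), 1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 'SMARTS label', got {line!r}"
            )
        smarts, label = fields
        if label in seen:
            raise ValueError(f"line {line_number}: duplicate label {label!r}")
        seen.add(label)
        pairs.append((smarts, label))
    return pairs
```

Calling read_patterns("working/sub_pattern.smarts") either returns the parsed pairs or raises a ValueError that names the offending line.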
5) Split the ZINC-downloader-2D-smi.database_index file into chunks
  cd working
  python ../scripts/setup_substructure_searching_files.py . sub_searching_ ZINC-downloader-2D-smi.database_index number_of_chunks count
Replace number_of_chunks with an actual number. I suggest about 3 SMILES files per chunk, so choose number_of_chunks according to how many SMILES files your selected tranches contain.
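The suggestion of 3 SMILES files per chunk can be turned into a number with a short calculation. This is a sketch under one assumption that the page does not state: that the database index lists one SMILES file per non-blank line. The function name suggest_number_of_chunks is hypothetical and is not part of the scripts copied in step 3.

```python
import math
from pathlib import Path


def suggest_number_of_chunks(index_path, files_per_chunk=3):
    """Count non-blank index lines and divide by files_per_chunk, rounding up.

    Assumes each non-blank line of the index refers to one SMILES file.
    """
    lines = Path(index_path).read_text().splitlines()
    n_files = sum(1 for line in lines if line.strip())
    # Always return at least one chunk, even for a very small index.
    return max(1, math.ceil(n_files / files_per_chunk))
```

For example, an index with 7 entries gives 3 chunks (two chunks of 3 files and one smaller chunk, if the split is even). The result can then be used in place of number_of_chunks in the command above.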
6) Submit the substructure searching jobs
  csh ../scripts/submit.csh full_path_of_sub_pattern.smarts
7) Collect the results (run this from the working directory)
  cat sub_searching_*/*.extract.output.smi > output.smi
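If you also want to see how many hits each chunk produced, the cat command can be mirrored in Python. This is a minimal sketch, not part of the original scripts: collect_results is a hypothetical name, and it only assumes the sub_searching_*/*.extract.output.smi naming used in the command above, with one hit per line.

```python
import glob
import os


def collect_results(working_dir, output_name="output.smi"):
    """Concatenate chunk outputs and return {chunk_directory: line_count}."""
    pattern = os.path.join(working_dir, "sub_searching_*", "*.extract.output.smi")
    out_path = os.path.join(working_dir, output_name)
    counts = {}
    with open(out_path, "w") as out:
        # Sorted order matches how the shell expands the glob for cat.
        for path in sorted(glob.glob(pattern)):
            chunk = os.path.basename(os.path.dirname(path))
            counts.setdefault(chunk, 0)  # record chunks with empty output too
            with open(path) as handle:
                for line in handle:
                    out.write(line)
                    counts[chunk] += 1
    return counts
```

A chunk directory that is missing from the returned dictionary produced no output file at all, which may mean its job has not finished or has failed. A count of 0 only means that the output file exists but is empty.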