Substructure searching
Latest revision as of 19:14, 13 September 2017
Written by Jiankun Lyu, 2017/09/13
The hierarchy of the directories:
 substructure_searching ----- working
                         |       |
                         |       |------ ZINC-downloader-2D-smi.database_index
                         |       |
                         |       |------ sub_pattern.smarts
                         |
                         ------- scripts ------ submit.csh
                                           |
                                           |------ setup_substructure_searching_files.py
1) Create the directories shown above.
 mkdir substructure_searching
 cd substructure_searching
 mkdir working
 mkdir scripts
2) Download the database index from ZINC
2.1) Go to ZINC http://zinc15.docking.org/tranches/home/#
2.2) Choose the tranches you want to search
2.3) Download the database index file
2.4) Save that file as ZINC-downloader-2D-smi.database_index, then upload it to the working directory
3) Copy scripts from my path.
 cd scripts
 cp /mnt/nfs/home/jklyu/zzz.script/analogs_searching/setup_substructure_searching_files.py .
 cp /mnt/nfs/home/jklyu/zzz.script/analogs_searching/multi_sub_searching/submit.csh .
 cd ../
4) Put the SMARTS patterns you want to search for in the sub_pattern.smarts file, and give each SMARTS pattern a unique number or name
Here is an example of the sub_pattern.smarts file:
 NS(=O)(=O)c1cccc([F,Cl,Br,I])c1[OD1] 1
 NS(=O)(=O)c1cc([F,Cl,Br,I])ccc1[OD1] 2
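Because every pattern needs a unique number or name, it can help to check the file before submitting jobs. The following is a minimal sketch, not part of the scripts above: parse_pattern_file is a hypothetical helper, and it assumes each non-blank line holds exactly two whitespace-separated fields (the SMARTS string, then its label), as in the example.

```python
def parse_pattern_file(text):
    """Parse 'SMARTS label' lines and reject malformed lines or duplicate labels."""
    patterns = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 'SMARTS label', got {line!r}"
            )
        smarts, label = fields
        if label in patterns:
            raise ValueError(f"line {line_number}: duplicate label {label!r}")
        patterns[label] = smarts
    return patterns


# The two example patterns from sub_pattern.smarts.
example = """NS(=O)(=O)c1cccc([F,Cl,Br,I])c1[OD1] 1
NS(=O)(=O)c1cc([F,Cl,Br,I])ccc1[OD1] 2
"""
patterns = parse_pattern_file(example)
for label, smarts in patterns.items():
    print(label, smarts)
```

This only checks the layout and the uniqueness of the labels; it does not verify that each SMARTS string is chemically valid.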
5) Split the ZINC-downloader-2D-smi.database_index file into chunks
 cd working
 python ../scripts/setup_substructure_searching_files.py . sub_searching_ ZINC-downloader-2D-smi.database_index number_of_chunks count
Replace number_of_chunks with the actual number of chunks you want.
I suggest about 3 SMILES files per chunk, so choose number_of_chunks according to how many tranches you actually selected.
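The suggestion of 3 SMILES files per chunk translates into a simple calculation. The sketch below assumes that each non-blank line of the index file corresponds to one SMILES file; suggested_number_of_chunks is a hypothetical helper, and the entry names are placeholders rather than real ZINC entries.

```python
import math


def suggested_number_of_chunks(index_lines, files_per_chunk=3):
    """Return how many chunks give roughly files_per_chunk files per chunk."""
    # Assumption: one SMILES file per non-blank line of the index file.
    n_files = sum(1 for line in index_lines if line.strip())
    return max(1, math.ceil(n_files / files_per_chunk))


# Placeholder index with 10 entries.
index_lines = [f"entry_{i}.smi\n" for i in range(10)]
number_of_chunks = suggested_number_of_chunks(index_lines)
print(number_of_chunks)
```

For a real run, the lines would come from open("ZINC-downloader-2D-smi.database_index"), and the printed value would be passed as number_of_chunks in the command above.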
6) Submit substructure searching jobs
csh ../scripts/submit.csh full_path_of_sub_pattern.smarts
7) Collect results
cat sub_searching_*/*.extract.output.smi > output.smi
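If you prefer to collect the results from Python, for example to count the hits at the same time, the cat command can be approximated as below. This is a sketch under assumptions: collect_results is a hypothetical helper, the demo files and their contents are made up, and unlike cat it skips blank lines and adds a final newline where one is missing.

```python
import glob
import os
import tempfile


def collect_results(working_dir, output_name="output.smi"):
    """Merge sub_searching_*/*.extract.output.smi into one file; return the line count."""
    pattern = os.path.join(working_dir, "sub_searching_*", "*.extract.output.smi")
    paths = sorted(glob.glob(pattern))  # sorted, like shell glob expansion
    n_lines = 0
    with open(os.path.join(working_dir, output_name), "w") as out:
        for path in paths:
            with open(path) as handle:
                for line in handle:
                    if not line.strip():
                        continue  # skip blank lines
                    out.write(line if line.endswith("\n") else line + "\n")
                    n_lines += 1
    return n_lines


# Demo on a temporary directory with made-up result files.
with tempfile.TemporaryDirectory() as working_dir:
    demo = {
        "sub_searching_0": "CCO hit_a\nCCN hit_b\n",
        "sub_searching_1": "CCC hit_c",  # no trailing newline
    }
    for chunk_dir, content in demo.items():
        os.makedirs(os.path.join(working_dir, chunk_dir))
        result_path = os.path.join(working_dir, chunk_dir, "demo.extract.output.smi")
        with open(result_path, "w") as handle:
            handle.write(content)
    n_hits = collect_results(working_dir)
    with open(os.path.join(working_dir, "output.smi")) as handle:
        merged = handle.read()

print(n_hits)
```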