Whole Library TC to Knowns Calculations
Written by Jiankun Lyu, 20180524
The hierarchy of the directories:
TC_calculations---------- working
                |            |
                |            |------ smi.sdi
                |            |
                |            |------ db_smi ------ combined.smi
                |
                ------- scripts ------ submit.csh
                                |------ check_outputs.csh
                                |------ setup_tc_calculations.py
                                |------ make_chunks_for_file_new.py
                                |------ combine_tc_matrix.py
1) Make the top-level directories shown above.
mkdir TC_calculations
cd TC_calculations
mkdir working
mkdir scripts
2) Query SMILES from ZINC or prepare them yourself.
2.1) Query SMILES from ZINC
See http://wiki.docking.org/index.php/Large-scale_SMILES_Requesting_and_Fingerprints_Converting
2.2) Prepare them yourself
2.2.1) Combine all SMILES of the leadlike molecules in your local directory
2.2.2) Run a script to match ZINC IDs with SMILES
2.3) Copy the combined SMILES file to the db_smi directory as combined.smi
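The matching script for step 2.2.2 is not listed on this page. As a minimal sketch of the idea only, assuming each SMILES line has the conventional "SMILES ID" layout separated by whitespace, the lookup could be done as below. The function names load_smiles_by_id and match_ids are hypothetical and are not part of the scripts copied in step 3.

```python
def load_smiles_by_id(smi_lines):
    """Map molecule ID -> SMILES from lines assumed to look like 'SMILES ID'."""
    table = {}
    for line in smi_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and lines without an ID
        smiles, mol_id = fields[0], fields[1]
        table.setdefault(mol_id, smiles)  # keep the first SMILES seen per ID
    return table


def match_ids(ids, table):
    """Return ('SMILES ID' lines for IDs found, list of IDs with no SMILES)."""
    matched, missing = [], []
    for mol_id in ids:
        if mol_id in table:
            matched.append(f"{table[mol_id]} {mol_id}")
        else:
            missing.append(mol_id)
    return matched, missing
```

The matched lines can be written out as combined.smi. The missing list is worth checking before you continue.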
3) Copy the scripts from my path.
cd scripts
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/check_outputs.csh .
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/combine_tc_matrix.py .
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/make_chunks_for_file_new.py .
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/submit.csh .
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/setup_tc_calculations.py .
cd ../
4) Chunk the combined SMILES file.
cd working
mkdir db_smi
cd db_smi
python ../../scripts/make_chunks_for_file_new.py combined.smi combined.smi 100 .
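For readers without access to make_chunks_for_file_new.py, here is a rough sketch of what chunking a SMILES file can look like. It is not the author's script. This page does not say whether the 100 argument means lines per chunk or number of chunks, so the sketch assumes lines per chunk. The output naming combined_<n>.smi is also an assumption, chosen only so that the files match the combined_*.smi pattern used in step 5.

```python
from itertools import islice
from pathlib import Path


def chunk_file(src, prefix, lines_per_chunk, out_dir="."):
    """Split src into <stem>_<n>.smi files of at most lines_per_chunk lines.

    Assumption: 'prefix' may carry an extension (e.g. 'combined.smi'),
    which is dropped to build the chunk names.
    """
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    stem = Path(prefix).stem  # 'combined.smi' -> 'combined'
    out_dir = Path(out_dir)
    written = []
    with open(src) as fh:
        while True:
            # Read the next block of lines without loading the whole file
            block = list(islice(fh, lines_per_chunk))
            if not block:
                break
            path = out_dir / f"{stem}_{len(written)}.smi"
            path.write_text("".join(block))
            written.append(path)
    return written
```

Called as chunk_file("combined.smi", "combined.smi", 100, "."), it mirrors the argument order of the command above under the stated assumptions.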
5) Make the sdi file. It lists the absolute path of every chunk, one per line.
cd ../
ls `pwd`/db_smi/combined_*.smi > smi.sdi
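The same listing can be produced from Python, which is handy when the number of chunks is too large for the shell glob. This is a sketch equivalent to the ls command above. The function name write_sdi is hypothetical.

```python
import glob
import os


def write_sdi(chunk_dir, sdi_path, pattern="combined_*.smi"):
    """Write the absolute paths of all chunk files, one per line, to sdi_path."""
    full_pattern = os.path.join(os.path.abspath(chunk_dir), pattern)
    paths = sorted(glob.glob(full_pattern))  # sorted, like ls output
    with open(sdi_path, "w") as fh:
        for path in paths:
            fh.write(path + "\n")
    return paths
```

From the working directory, write_sdi("db_smi", "smi.sdi") gives the same kind of file as the ls command.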
6) Set up your TC calculations
python ../scripts/setup_tc_calculations.py . TC_cal_ smi.sdi 101 count
7) Submit jobs
csh ../scripts/submit.csh name.uint16.fp name.smi name.uint16.count
name.uint16.fp, name.smi, and name.uint16.count together describe one set of known bioactive molecules (the molecules whose look-alikes you want to find in the sea of compounds).
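For orientation, TC here is read as the Tanimoto coefficient: the number of fingerprint features two molecules share divided by the number of features present in either one. The sketch below shows the arithmetic on plain Python sets of on-bit indices. It is not the code run by submit.csh, and it does not read the uint16 fingerprint files, whose layout is not described on this page. Taking the maximum over the knowns is one common way to summarize "TC to knowns" and is an assumption here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as on-bit index sets."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention for two empty fingerprints
    return len(a & b) / union


def max_tc_to_knowns(mol_bits, knowns):
    """Highest TC between one library molecule and any known (assumed summary)."""
    return max((tanimoto(mol_bits, known) for known in knowns), default=0.0)
```

For example, fingerprints {1, 2, 3} and {2, 3, 4} share 2 bits out of 4 in total, so their TC is 0.5.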
8) Check unfinished jobs
cd db_smi
csh ../../scripts/check_outputs.csh 101 combined
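If you want to cross-check by hand, the idea behind such a check can be sketched as follows: walk the chunk list and report every chunk whose result file is missing or empty. This is not check_outputs.csh. The name of the result file is not given on this page, so the ".tc" suffix below is purely hypothetical. Pass your own output_for function to match the real naming.

```python
import os


def unfinished_chunks(sdi_path, output_for=lambda chunk: chunk + ".tc"):
    """Return chunk paths from sdi_path whose output is missing or empty.

    Assumption: output_for maps a chunk path to its result file; the
    default '.tc' suffix is a placeholder, not the real naming scheme.
    """
    pending = []
    with open(sdi_path) as fh:
        for line in fh:
            chunk = line.strip()
            if not chunk:
                continue  # ignore blank lines in the sdi file
            out = output_for(chunk)
            if not os.path.isfile(out) or os.path.getsize(out) == 0:
                pending.append(chunk)
    return pending
```

Any chunk it returns is a candidate for resubmission.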
9) Combine all the TC results
python ../../scripts/combine_tc_matrix.py 101 combined