Whole Library TC to Knowns Calculations

Written by Jiankun Lyu, 20180524

Directory hierarchy:

TC_calculations
    |
    |------ working
    |           |
    |           |------ smi.sdi
    |           |
    |           |------ db_smi ------ combined.smi
    |
    |------ scripts
                |
                |------ submit.csh
                |
                |------ check_outputs.csh
                |
                |------ setup_tc_calculations.py
                |
                |------ make_chunks_for_file_new.py
                |
                |------ combine_tc_matrix.py

1) Create the top-level directories shown above.

mkdir TC_calculations
cd TC_calculations
mkdir working
mkdir scripts

2) Get the SMILES, either by querying ZINC or by preparing them yourself.

2.1) Query SMILES from ZINC

See http://wiki.docking.org/index.php/Large-scale_SMILES_Requesting_and_Fingerprints_Converting

2.2) Prepare them yourself

2.2.1) Combine all the SMILES of the lead-like molecules in your local directory

2.2.2) Run a script to match ZINC IDs with SMILES

2.3) Copy the combined SMILES file (combined.smi) into working/db_smi, creating that directory first if it does not exist yet

3) Copy the scripts from my directory.

cd scripts
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/check_outputs.csh .
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/combine_tc_matrix.py .
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/make_chunks_for_file_new.py .
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/submit.csh .
cp /mnt/nfs/home/jklyu/zzz.script/TC_to_knowns/setup_tc_calculations.py .
cd ../

4) Split the combined SMILES file into chunks.

cd working
mkdir -p db_smi
cd db_smi
python ../../scripts/make_chunks_for_file_new.py combined.smi combined.smi 100 .
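
For reference, here is a rough Python sketch of what splitting a SMILES file into chunks amounts to. It is not the code of make_chunks_for_file_new.py, and it does not tell you what the 100 in the command above means. The function name chunk_file, the lines_per_chunk parameter and the zero-based <prefix>_<i>.smi numbering are assumptions made only for this illustration.

```python
from pathlib import Path


def _flush(lines, out_dir, prefix, index):
    """Write one chunk to <out_dir>/<prefix>_<index>.smi and return its path."""
    path = out_dir / f"{prefix}_{index}.smi"
    path.write_text("".join(lines))
    return path


def chunk_file(src, out_dir, prefix, lines_per_chunk):
    """Split src into consecutive chunk files (hypothetical helper).

    Blank lines are skipped; the last chunk may be shorter than the others.
    Returns the list of chunk paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    buffer = []
    with open(src) as handle:
        for line in handle:
            if not line.strip():
                continue
            # Make sure every record ends with a newline.
            buffer.append(line if line.endswith("\n") else line + "\n")
            if len(buffer) == lines_per_chunk:
                written.append(_flush(buffer, out_dir, prefix, len(written)))
                buffer = []
    if buffer:
        # Leftover records that did not fill a whole chunk.
        written.append(_flush(buffer, out_dir, prefix, len(written)))
    return written
```

For example, chunk_file("combined.smi", ".", "combined", 2) on a five-line file would write combined_0.smi, combined_1.smi and combined_2.smi, the last one holding a single line.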

5) Make the sdi file (a list of the absolute paths of all the chunk files).

cd ../
ls `pwd`/db_smi/combined_*.smi > smi.sdi
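
The ls command above writes one absolute path per line and only picks up the chunk files (combined_*.smi), not combined.smi itself. If you would rather build the list from a script, the following sketch does the same thing in Python. The name write_sdi and its arguments are made up for this example.

```python
from pathlib import Path


def write_sdi(chunk_dir, sdi_path, pattern="combined_*.smi"):
    """Write the absolute paths of all chunk files, one per line (hypothetical helper).

    Returns the sorted list of paths that was written.
    """
    paths = sorted(str(p.resolve()) for p in Path(chunk_dir).glob(pattern))
    Path(sdi_path).write_text("".join(p + "\n" for p in paths))
    return paths
```

Run from the working directory, write_sdi("db_smi", "smi.sdi") would produce a file like the smi.sdi made with ls.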

6) Set up your TC calculations.

python ../scripts/setup_tc_calculations.py . TC_cal_ smi.sdi 101 count

7) Submit the jobs.

csh ../scripts/submit.csh name.uint16.fp name.smi name.uint16.count

name.uint16.fp, name.smi, and name.uint16.count are the files for your set of known bioactive molecules, i.e. the molecules whose look-alikes you want to find in the sea of compounds. Replace name with the prefix of your own files.
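
TC here is the Tanimoto coefficient. The uint16 fingerprint format and the code that submit.csh actually runs are not described on this page, so the following is only a conceptual sketch. It assumes a plain binary fingerprint given as the set of its "on" bit positions, and the function names tanimoto and max_tc_to_knowns are made up for this example.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bits.

    TC = |A intersect B| / |A union B|; two empty fingerprints are scored 0.0 here.
    """
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


def max_tc_to_knowns(library_fp, known_fps):
    """Highest TC between one library molecule and any of the knowns."""
    return max((tanimoto(library_fp, k) for k in known_fps), default=0.0)
```

For example, fingerprints with on-bits {1, 2, 3} and {2, 3, 4} share 2 bits out of 4 distinct ones, so their TC is 0.5.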


8) Check for unfinished jobs.

cd db_smi
csh ../../scripts/check_outputs.csh 101 combined
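
The idea behind this kind of check is to go through the chunk numbers and report the ones whose output is missing or empty, so that those jobs can be resubmitted. The sketch below shows that idea in Python. It is not a translation of check_outputs.csh: the output naming <prefix>_<i><suffix>, the default suffix ".out", the zero-based numbering and the function name missing_outputs are all assumptions for illustration.

```python
from pathlib import Path


def missing_outputs(directory, prefix, n_chunks, suffix=".out"):
    """Return the chunk indices whose output file is absent or empty (hypothetical helper)."""
    directory = Path(directory)
    missing = []
    for i in range(n_chunks):
        out = directory / f"{prefix}_{i}{suffix}"
        # A job counts as unfinished if it left no file or a zero-byte file.
        if not out.is_file() or out.stat().st_size == 0:
            missing.append(i)
    return missing
```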

9) Combine all the TC results.

python ../../scripts/combine_tc_matrix.py 101 combined