Bemis-Murcko Scaffold Analysis

Written by Jiankun Lyu, 20170918

Run the Large-scale SMILES Requesting and Fingerprints Converting procedure first to obtain the SMILES files, then run the scaffold analysis described below.

The directory hierarchy:

scaffold_analysis/
    |
    |------ working/
    |           |------ combined.smi (soft link)
    |           |------ chunked_smiles_files/
    |
    |------ scripts/
                |------ submit.csh
                |------ setup_BM_scaffold_analysis.py
                |------ combine_scaffold_smi.py
                |------ scaffold_analysis.py

1) Make the directories and copy the scripts

mkdir scaffold_analysis
cd scaffold_analysis
mkdir working
mkdir scripts
cd working
mkdir chunked_smiles_files
ln -s /path/to/combined.smi
cd ../scripts
cp /mnt/nfs/home/jklyu/zzz.script/large_scale_docking/cluster_analysis/BM_scaffold_analysis/submit.csh .
cp /mnt/nfs/home/jklyu/zzz.script/large_scale_docking/cluster_analysis/BM_scaffold_analysis/setup_BM_scaffold_analysis.py .
cp /mnt/nfs/home/jklyu/zzz.script/large_scale_docking/cluster_analysis/BM_scaffold_analysis/combine_scaffold_smi.py .
cp /mnt/nfs/home/jklyu/zzz.script/large_scale_docking/cluster_analysis/BM_scaffold_analysis/scaffold_analysis.py .
cd ../

2) Split the SMILES file

cd working/chunked_smiles_files
split -d -a 3 -l 600000 ../combined.smi combined.smi_
cd ../
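
For readers who want to see what the split command produces, the following is a minimal Python sketch of the same chunking, not part of the original scripts. The function name split_smiles and its parameters are hypothetical; the defaults mirror the options used above (numeric suffixes, 3 digits, 600000 lines per chunk, prefix combined.smi_).

```python
from itertools import islice
from pathlib import Path


def split_smiles(src, out_dir, lines_per_chunk=600000,
                 prefix="combined.smi_", digits=3):
    """Write src into numbered chunks, mimicking `split -d -a 3 -l N`.

    Hypothetical helper for illustration; returns the chunk paths in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with open(src) as handle:
        index = 0
        while True:
            # Read at most lines_per_chunk lines without loading the whole file.
            block = list(islice(handle, lines_per_chunk))
            if not block:
                break
            # Zero-padded numeric suffix: combined.smi_000, combined.smi_001, ...
            chunk_path = out_dir / f"{prefix}{index:0{digits}d}"
            chunk_path.write_text("".join(block))
            chunks.append(chunk_path)
            index += 1
    return chunks
```

Calling split_smiles("combined.smi", "chunked_smiles_files") from the working directory would give files named combined.smi_000, combined.smi_001, and so on, each holding at most 600000 lines.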

3) Create the all_smiles.sdi file, which lists the full path of every chunked SMILES file

ls /full/path/to/chunked_smiles_files/combined.smi_* > all_smiles.sdi
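
The same file list can be built without typing the full path by hand. The sketch below is an illustration only: write_sdi is a hypothetical helper, and it assumes the chunk files follow the combined.smi_* naming from step 2. It resolves the chunk directory to an absolute path and writes one path per line in sorted order, as the ls command above does.

```python
from pathlib import Path


def write_sdi(chunk_dir, sdi_path, pattern="combined.smi_*"):
    """List every chunk file as an absolute path, one per line, in sorted order.

    Hypothetical helper for illustration; returns the paths that were written.
    """
    # resolve() makes the directory absolute, so every listed path is a full path.
    chunk_paths = sorted(Path(chunk_dir).resolve().glob(pattern))
    if not chunk_paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {chunk_dir}")
    Path(sdi_path).write_text("".join(f"{path}\n" for path in chunk_paths))
    return chunk_paths
```

From the working directory, write_sdi("chunked_smiles_files", "all_smiles.sdi") would produce the file used in the next step.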

4) Set up files and directories

python ../scripts/setup_BM_scaffold_analysis.py . scaffold_analysis_ all_smiles.sdi 60 count

5) Submit the scaffold analysis jobs

csh ../scripts/submit.csh

6) Collect data from directories

python ../scripts/combine_scaffold_smi.py

7) Analyze data

python ../scripts/scaffold_analysis.py combined.scaffold.energy.smi output_prefix
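
The internals of scaffold_analysis.py and the exact layout of combined.scaffold.energy.smi are not described on this page. Purely as an illustration of the kind of summary a scaffold analysis can give, the sketch below assumes a hypothetical whitespace-separated layout of "scaffold_smiles molecule_id energy" per line and assumes that a lower energy is better. It counts the molecules that share each scaffold and records the best energy per scaffold. The function name summarize_scaffolds is also an assumption.

```python
from collections import defaultdict


def summarize_scaffolds(lines):
    """Group records by scaffold; return (scaffold, member count, best energy) rows.

    ASSUMPTION: each line is 'scaffold_smiles molecule_id energy',
    whitespace-separated, and a lower energy is more favorable.
    """
    energies = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        scaffold, energy = fields[0], float(fields[2])
        energies[scaffold].append(energy)
    rows = [(scaffold, len(vals), min(vals)) for scaffold, vals in energies.items()]
    # Most populated scaffolds first; ties broken by the lower (better) energy.
    rows.sort(key=lambda row: (-row[1], row[2]))
    return rows
```

If the real file uses a different column order, only the two field indices in the loop need to change.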