Large-scale SMILES requesting

From DISI

Revision as of 22:25, 18 September 2017

Written by Jiankun Lyu, 20170918

The hierarchy of the directories:

smiles_requesting/----- working/
              |                |
              |                |------ extract_all.sort.uniq.txt (soft link)
              |                |
              |                |------ db_zincid/
              |
              ------- scripts/ ------ submit.csh
                              |
                              |------ setup_converting_fps_files.py
                              |
                              |------ combine_smi_and_fp.py
                              |
                              |------ check_outputs.csh


This tutorial describes how to request SMILES from the ZINC server for a large set of docking results, typically more than 5 million ZINC IDs.

1) Make directories and copy scripts

mkdir smiles_requesting
cd smiles_requesting
mkdir working
mkdir scripts
cd working
mkdir db_zincid
ln -s /path/to/extract_all.sort.uniq.txt
cd ../scripts
cp /mnt/nfs/home/jklyu/zzz.script/large_scale_docking/cluster_analysis/best_first_clustering/converting_fps/submit.csh .
cp /mnt/nfs/home/jklyu/zzz.script/large_scale_docking/cluster_analysis/best_first_clustering/converting_fps/setup_converting_fps_files.py .
cp /mnt/nfs/home/jklyu/zzz.script/large_scale_docking/cluster_analysis/best_first_clustering/converting_fps/combine_smi_and_fp.py .
cp /mnt/nfs/home/jklyu/zzz.script/large_scale_docking/cluster_analysis/best_first_clustering/converting_fps/check_outputs.csh .
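
Before moving on, it can help to confirm that the layout matches the hierarchy shown at the top of the page. The following is a minimal sketch, not part of the original scripts: check_layout is a hypothetical helper name, and it only tests that the directories, the soft link, and the four copied scripts are present.

```shell
# Hypothetical helper: report any expected entry that is missing under the given root.
check_layout() {
  root=$1
  missing=0
  for d in working working/db_zincid scripts; do
    [ -d "$root/$d" ] || { echo "missing directory: $d"; missing=1; }
  done
  # -e follows symlinks, so a dangling soft link is reported as well.
  [ -e "$root/working/extract_all.sort.uniq.txt" ] || {
    echo "missing or dangling link: working/extract_all.sort.uniq.txt"; missing=1;
  }
  for f in submit.csh setup_converting_fps_files.py combine_smi_and_fp.py check_outputs.csh; do
    [ -f "$root/scripts/$f" ] || { echo "missing script: scripts/$f"; missing=1; }
  done
  return $missing
}

# Run from the directory that contains smiles_requesting/.
check_layout smiles_requesting || echo "layout incomplete"
```

The function returns a non-zero status when anything is missing, so it can also be used as a guard before submitting jobs.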

2) Get ZINC ID and energy columns from the extract_all.sort.uniq.txt file

split -d -l 300000 ../combined.smi combined.smi_
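
The split command above cuts a file into 300,000-line chunks with numeric suffixes (combined.smi_00, combined.smi_01, and so on). The sketch below shows how the column extraction named in the step title could be combined with the same chunking. It runs on a toy file, and everything about that file is an assumption: the column layout, the column numbers in id_col and energy_col, the file names, and the db_zincid_ prefix are illustrative only, so check them against your own extract_all.sort.uniq.txt before adapting it.

```shell
# Toy stand-in for extract_all.sort.uniq.txt.
# Made-up layout for illustration: path, ZINC ID, rank, energy.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' \
  './chunk1 ZINC000000000001 1 -45.20' \
  './chunk1 ZINC000000000002 2 -44.10' \
  './chunk2 ZINC000000000003 3 -43.75' > toy_extract.txt

id_col=2       # assumption: column holding the ZINC ID
energy_col=4   # assumption: column holding the energy

# Keep only the ZINC ID and energy columns.
awk -v id="$id_col" -v en="$energy_col" '{ print $id, $en }' toy_extract.txt > zincid_energy.txt

# Split into chunks of 2 lines for the toy data (the tutorial uses 300000).
# -d gives numeric suffixes (GNU split): db_zincid_00, db_zincid_01, ...
split -d -l 2 zincid_energy.txt db_zincid_
ls db_zincid_*
```

Setting the column numbers as variables keeps the awk call unchanged if the real file uses different positions.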