Difference between revisions of "Another get poses.py"

Revision as of 17:16, 24 March 2020

3/25/2020 Ying

Shuo's interaction filter and strain filter both need poses, and sometimes we need those poses before clustering. For that purpose, here is another get_poses.py script, modified from getposes_blazing_faster.py by Reed & Trent. The idea is to keep only one pose per zincid: the one with the best dock score. The script reads the extract_all.sort.uniq.txt file and stores the minimum score (min_score) for each zincid. While processing the mol2.gz file, it checks whether each molecule's score matches the min_score stored for its zincid; if it does not, the script skips to the next molecule.
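The selection logic described above can be sketched roughly as follows. This is not the code of get_poses.py itself, only a minimal illustration. The function names, the column positions (id_col, score_col), the tolerance, and the (zincid, score, mol2_text) record shape are all assumptions made for the example; the real layout of extract_all.sort.uniq.txt and of the mol2.gz headers should be checked before reusing any of it.

```python
import math


def best_scores(lines, id_col=0, score_col=-1):
    """Map each zincid to its lowest (best) dock score.

    id_col and score_col are assumed column positions; adjust them to
    the real layout of extract_all.sort.uniq.txt.
    """
    best = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        zincid = fields[id_col]
        score = float(fields[score_col])
        if zincid not in best or score < best[zincid]:
            best[zincid] = score
    return best


def keep_best_poses(poses, best, tol=1e-3):
    """Yield at most one pose per zincid: the one matching min_score.

    poses is an iterable of (zincid, score, mol2_text) tuples, a
    hypothetical record shape standing in for parsed mol2.gz entries.
    """
    seen = set()
    for zincid, score, mol2_text in poses:
        if zincid in seen or zincid not in best:
            continue  # already written, or not in the score table
        if math.isclose(score, best[zincid], abs_tol=tol):
            seen.add(zincid)
            yield zincid, score, mol2_text
```

Comparing scores with a small tolerance rather than exact equality is a design choice of this sketch, meant to guard against rounding differences between the score table and the mol2 headers.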

First, set the environment variables:

source /nfs/home/yingyang/.cshrc_opencadd

Get help information:

python /nfs/home/yingyang/scripts/get_poses.py -h
usage: get_poses.py [-h] [-d DIR] [-s SCORE] [-n NUM] [-f FILE] [-o OUT]
                    [-z GZ_FILE]
optional arguments:
 -h, --help  show this help message and exit
 -d DIR      path to where docking is located (default: )
 -s SCORE    path to where the extract all file is (default:
             extract_all.sort.uniq.txt)
 -n NUM      number of molecules (poses) to get. (default: 500)
 -f FILE     file contained ligand names to extract (default: None)
 -o OUT      file name for poses (default: poses.mol2)
 -z GZ_FILE  file name for input (default: test.mol2.gz)

Example 1: get the top 6k molecules from extract_all.sort.uniq.txt (in the docking directory). This is the usual getposes routine.

 python /nfs/home/yingyang/scripts/get_poses.py -s extract_all.sort.uniq.txt -n 6000 -o poses_top6k.mol2

Example 2: get only the molecules whose names are listed in a file, cutting off at the top 10k.

 python /nfs/home/yingyang/scripts/get_poses.py -s extract_all.sort.uniq.txt -n 10000 -f <zincid.txt> -o poses_interested.mol2
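
After either example, it can be useful to confirm that the output file holds no more poses than the -n cutoff. The snippet below is a small sanity check, not part of get_poses.py. It assumes the output is a standard Tripos mol2 file in which every molecule record starts with an @<TRIPOS>MOLECULE line; count_mol2_records is a hypothetical helper name.

```python
def count_mol2_records(path):
    """Count molecule records in a mol2 file.

    Assumes each record begins with a line starting with
    '@<TRIPOS>MOLECULE', as in the standard Tripos mol2 format.
    """
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith("@<TRIPOS>MOLECULE"):
                count += 1
    return count
```

For instance, calling count_mol2_records("poses_top6k.mol2") after Example 1 should return a number no greater than 6000.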