Db2multipdb.py

From DISI

db2multipdb.py is a small Python script that decodes Flexibase .db files into multipdb files (multi-model pdb files that can be read by any viewer) and performs some simple checks on the .db file.

Usage: db2multipdb.py [options] file.db [more db files]

Convert .db files to multiple pdb files, check for errors

Options:
 -h, --help     show this help message and exit
 -v, --verbose  lots of debugging output
 -n, --nopdb    don't write pdb files, just do broken checking

The script is located at

$DOCK_BASE/scripts/db2multipdb.py

or alternatively

~rgc/Source/bks_src/db2multipdb.py

If python2.6 is not in your path, either add it or run the program with the full interpreter path, like this

/usr/arch/bin/python2.6 ~rgc/Source/bks_src/db2multipdb.py
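When many .db files need checking, the invocation above can be wrapped from a separate, modern Python 3 interpreter. The following is only a sketch: build_command and run_check are hypothetical helper names, the default paths are the ones quoted on this page, and the flags are the ones listed under Options.

```python
import os
import subprocess

# Paths quoted on this page; adjust them for your installation.
PYTHON26 = "/usr/arch/bin/python2.6"
SCRIPT = "~rgc/Source/bks_src/db2multipdb.py"


def build_command(db_files, nopdb=True, verbose=False,
                  python=PYTHON26, script=SCRIPT):
    """Return the argument list for one db2multipdb.py run."""
    # expanduser resolves "~rgc" because no shell is involved here.
    cmd = [python, os.path.expanduser(script)]
    if verbose:
        cmd.append("-v")   # lots of debugging output
    if nopdb:
        cmd.append("-n")   # skip pdb output, only do the broken checking
    cmd.extend(db_files)
    return cmd


def run_check(db_files, **kwargs):
    """Run the script and return its stdout as text (hypothetical helper)."""
    result = subprocess.run(build_command(db_files, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Building the argument list separately from running it makes the command easy to inspect or log before anything is executed.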

Verbose output (-v flag) is available but not typically needed. Suppressing pdb output (-n flag) is useful if you don't need the pdb files and just want to do the broken checking. Each separate .db entry generates the following output to stdout:

P00000008 being processed now 1
P00000008 1 errors of each type: 542 0 0 0 no errors: 817 total models 1359

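The second line is roughly interpreted as the following fields, in order:

ZINCID (P00000008)
number of times this ZINCID has been seen (1)
number of errors of type a, b, c and d (542 0 0 0)
number of models without errors (817)
total number of models (1359)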

where a,b,c,d type errors are defined as

a is atoms closer than 0.95 angstroms
b is oxygen atoms closer than 2.0 angstroms
c is heavy atoms closer than 1.07 angstroms
d is no other atoms within 2.2 angstroms

Note that for a given conformation, at most one error is reported, whatever its type. Type d (critical) errors take precedence over type a, b and c errors: if a conformation has a type d error, that is the one reported. As a result, the number of errors of each type plus the number of models without errors always equals the total number of models.
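For batch processing, the summary line can be parsed and the counting rule above checked automatically. This is a minimal sketch: the regular expression is modelled only on the single sample line shown on this page, and EntrySummary, parse_summary and is_consistent are hypothetical names, not part of db2multipdb.py.

```python
import re
from typing import NamedTuple


class EntrySummary(NamedTuple):
    zinc_id: str
    times_seen: int
    errors: tuple        # counts of type a, b, c, d errors, in that order
    no_errors: int
    total_models: int


# Assumption: every summary line has the same layout as the sample line.
SUMMARY_RE = re.compile(
    r"^(\S+)\s+(\d+)\s+errors of each type:\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+no errors:\s+(\d+)\s+total models\s+(\d+)\s*$"
)


def parse_summary(line):
    """Return an EntrySummary, or None if the line is not a summary line."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        return None
    numbers = [int(group) for group in match.groups()[1:]]
    return EntrySummary(match.group(1), numbers[0], tuple(numbers[1:5]),
                        numbers[5], numbers[6])


def is_consistent(summary):
    """Check that error counts plus error-free models equal the total."""
    return sum(summary.errors) + summary.no_errors == summary.total_models
```

Lines that do not match, such as the "being processed now" line, return None and can simply be skipped.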

Exactly which atoms have these errors can be seen with the -v option. Errors of the first three types are expected, because the 'mix-and-match' conformations are generated by recombining separate flexible branches, which can overlap. Errors of type d should not occur, but they have been observed in the past.
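To make the four error types concrete, here is a sketch of how one conformation could be classified from its coordinates. It is not the script's actual implementation. It assumes that every atom pair is tested with no exclusion for bonded atoms, that type d means some atom has no other atom within 2.2 angstroms, and that types a, b and c are ranked in that order. Only the precedence of type d comes from this page.

```python
import math

# Thresholds in angstroms, taken from the definitions on this page.
ANY_CUTOFF = 0.95       # type a: any two atoms
OXYGEN_CUTOFF = 2.0     # type b: two oxygen atoms
HEAVY_CUTOFF = 1.07     # type c: two heavy (non-hydrogen) atoms
ISOLATED_CUTOFF = 2.2   # type d: no other atom within this distance


def classify(atoms):
    """Return 'a', 'b', 'c', 'd' or None for one conformation.

    atoms is a list of (element, x, y, z) tuples.
    """
    found = set()
    for i, (elem_i, *xyz_i) in enumerate(atoms):
        has_neighbor = False
        for j, (elem_j, *xyz_j) in enumerate(atoms):
            if i == j:
                continue
            dist = math.dist(xyz_i, xyz_j)
            if dist <= ISOLATED_CUTOFF:
                has_neighbor = True
            if dist < ANY_CUTOFF:
                found.add("a")
            if elem_i == "O" and elem_j == "O" and dist < OXYGEN_CUTOFF:
                found.add("b")
            if elem_i != "H" and elem_j != "H" and dist < HEAVY_CUTOFF:
                found.add("c")
        if not has_neighbor:
            found.add("d")
    # Type d first (from the text); the order of a, b, c is an assumption.
    for kind in ("d", "a", "b", "c"):
        if kind in found:
            return kind
    return None
```

Because only one label is returned per conformation, tallying the results over all models reproduces the property that the per-type counts and the error-free count add up to the total.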

If pdb output is not suppressed, files are written with names like P00000008.001.pdb, where the first 9 characters are the ZINCID read from the db file, followed by a unique counter (since a .db file can contain multiple .db entries for one ZINCID). Each file is a normal pdb file in which each MODEL is one unique conformation. These files can be quite large, and writing them to disk takes much longer than anything else the code does. If you load such a file in PyMOL, for instance, and hit the 'play' button, it will step through the entire set. Other post-processing or conversion is also possible.
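As one example of such post-processing, the sketch below builds a file name in the pattern described above and splits a multi-model pdb file into its conformations. The helper names pdb_name and iter_models are hypothetical. The three-digit zero padding is inferred from the single example name, and the handling of ENDMDL follows the standard PDB format rather than anything stated on this page.

```python
def pdb_name(zinc_id, counter):
    """Build a name like P00000008.001.pdb (three-digit padding assumed)."""
    return "%s.%03d.pdb" % (zinc_id[:9], counter)


def iter_models(lines):
    """Yield one list of ATOM/HETATM lines per MODEL record."""
    current = None
    for line in lines:
        if line.startswith("MODEL"):
            # A new MODEL also closes any model left open.
            if current is not None:
                yield current
            current = []
        elif line.startswith("ENDMDL"):
            if current is not None:
                yield current
            current = None
        elif current is not None and line.startswith(("ATOM", "HETATM")):
            current.append(line)
    if current is not None:
        yield current


def count_models(path):
    """Count the conformations stored in one multi-model pdb file."""
    with open(path) as handle:
        return sum(1 for _ in iter_models(handle))
```

Because iter_models is a generator, a large file is read one model at a time instead of being loaded into memory all at once.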

Questions? Contact Ryan Coleman.