DB2 File Format (page created by Teague Sterling, 2016-04-19T21:38:33Z)
<div>This page explains the DB2 file format used in DOCK3.7.<br />
<br />
= Nomenclature Definitions =<br />
<br />
* Conf - one set of atoms that moves together with a single position per atom.<br />
* Set - a group of conformations that completely defines one position for each atom in a ligand.<br />
* Cluster - Not yet implemented in DOCK3.7<br />
* Cloud - Not yet implemented in DOCK3.7<br />
<br />
= Record Types =<br />
*T type information (implicitly assumed)<br />
*M molecule (4 lines required; further lines are optional, 24 lines max)<br />
*A atoms<br />
*B bond<br />
*X xyz <br />
*R rigid xyz for matching (can actually be any xyzs) <br />
*C conformation<br />
*S sets<br />
*D clusters<br />
*E end of molecule<br />
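<br />
The record codes above are enough to cut a file into molecules. What follows is a minimal Python sketch, not DOCK code: the names <code>RECORD_TYPES</code> and <code>split_molecules</code> are invented here, and it assumes that the record letter is always the first character of a line and that every molecule is closed by an E line.<br />

```python
from collections import defaultdict

# Record codes taken from the list above; the descriptions are only labels.
RECORD_TYPES = {
    "T": "type information",
    "M": "molecule",
    "A": "atom",
    "B": "bond",
    "X": "xyz",
    "R": "rigid xyz",
    "C": "conformation",
    "S": "set",
    "D": "cluster",
    "E": "end of molecule",
}


def split_molecules(lines):
    """Yield one dict per molecule, mapping record code -> list of raw lines."""
    current = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        code = line[0]
        if code not in RECORD_TYPES:
            raise ValueError(f"unknown record type {code!r}")
        if code == "E":
            # E closes the molecule collected so far
            yield dict(current)
            current = defaultdict(list)
        else:
            current[code].append(line)
    if current:
        # Assumption of this sketch: records without a closing E are an error
        raise ValueError("file ended before an E record")
```

Grouping by code keeps the sketch independent of the column layout; any T lines are simply stored with the molecule that follows them, which is a simplification.<br />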
<br />
T ## namexxxx (implicitly assumed to be the standard 7)<br />
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters<br />
M charge polar_solv apolar_solv total_solv surface_area<br />
M smiles<br />
M longname<br />
[M arbitrary information preserved for writing out]<br />
A stuff about each atom, 1 per line <br />
B stuff about each bond, 1 per line<br />
X coordnum atomnum confnum x y z <br />
R rigidnum color x y z<br />
C confnum coordstart coordend<br />
S setnum #lines #confs_total broken hydrogens omega_energy<br />
S setnum linenum #confs confs [until full column]<br />
D clusternum setstart setend matchstart matchend #additionalmatching<br />
D matchnum color x y z<br />
E <br />
<br />
Given the record descriptions above, the column layout of each record follows. Python and Fortran format statements appear further down. If speed or size becomes an issue, this might be replaced with a binary file format.<br />
<br />
notes: 17 children groups/group per line in current scheme.<br />
9 children confs/group per line.<br />
9 children confs/conf per line.<br />
8 confs/set per line.<br />
groups/confs with no children are written out.<br />
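<br />
The limit of 8 confs per S line can be illustrated with a short Python sketch that writes one set as a header line plus continuation lines. The function name <code>set_records</code> is invented here. The sketch assumes that <code>#lines</code> counts only the continuation lines and that line numbers start at 1; neither is stated on this page.<br />

```python
CONFS_PER_LINE = 8  # from the note above: 8 confs/set per line


def set_records(setnum, confs, broken, hydrogens, energy):
    """Return the S header line plus the S continuation lines for one set."""
    chunks = [confs[i:i + CONFS_PER_LINE]
              for i in range(0, len(confs), CONFS_PER_LINE)]
    # Header: setnum, #lines, #confs_total, broken, hydrogens, omega_energy
    lines = ["S %6d %6d %3d %1d %1d %+11.3f"
             % (setnum, len(chunks), len(confs), broken, hydrogens, energy)]
    # Continuation: setnum, linenum (assumed 1-based), #confs, then the confs
    for lineno, chunk in enumerate(chunks, start=1):
        body = "".join(" %6d" % c for c in chunk)
        lines.append("S %6d %6d %1d%s" % (setnum, lineno, len(chunk), body))
    return lines
```

With ten confs, for example, this produces one header and two continuation lines, the second holding only two conf numbers.<br />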
<br />
On the atom line, DT is the dock type and CO is the color.<br />
<br />
          1         2         3         4         5         6         7<br />
01234567890123456789012345678901234567890123456789012345678901234567890123456789<br />
T ## typename<br />
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU<br />
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]<br />
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
B NUM ATO ATO TY<br />
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
C CONFNO COORDSTAR COORDENDX<br />
S SETIDX #LINES #CO C H +ENERGY.XXX<br />
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS<br />
D CLUSID STASET ENDSET MST MEN ADD<br />
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
E<br />
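<br />
Because every field sits at a fixed column, the templates above can themselves be used to locate the fields. The Python sketch below copies three of the templates and derives slice positions from them; <code>TEMPLATES</code>, <code>column_spans</code> and <code>split_fields</code> are names invented for this illustration, and the approach assumes the templates are reproduced with their exact spacing.<br />

```python
import re

# Column templates copied from the layout above (three records shown).
TEMPLATES = {
    "A": "A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA",
    "X": "X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX",
    "C": "C CONFNO COORDSTAR COORDENDX",
}


def column_spans(template):
    """Return (start, end) offsets of every field after the record letter."""
    return [m.span() for m in re.finditer(r"\S+", template)][1:]


def split_fields(line):
    """Cut a record line into stripped field strings by column position."""
    spans = column_spans(TEMPLATES[line[0]])
    return [line[start:end].strip() for start, end in spans]
```

Slicing by position rather than splitting on whitespace keeps fields apart even when adjacent values fill their columns completely.<br />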
<br />
The following type lines are assumed by DOCK unless overridden:<br />
T 1 positive<br />
T 2 negative<br />
T 3 acceptor<br />
T 4 donor<br />
T 5 ester_o<br />
T 6 amide_o<br />
T 7 neutral<br />
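<br />
A reader could model the defaults and their overrides as a table that explicit T lines update. This is only a sketch of that idea in Python: <code>DEFAULT_TYPES</code> and <code>resolve_types</code> are invented names, the column positions follow the T template on this page, and the type name <code>hbacc</code> in the usage line is made up.<br />

```python
# The seven default types listed above.
DEFAULT_TYPES = {1: "positive", 2: "negative", 3: "acceptor", 4: "donor",
                 5: "ester_o", 6: "amide_o", 7: "neutral"}


def resolve_types(t_lines):
    """Start from the defaults and apply any T records found in a file."""
    types = dict(DEFAULT_TYPES)
    for line in t_lines:
        # T template: 'T', space, 2-column number, space, 8-column name
        types[int(line[2:4])] = line[5:13].strip()
    return types


# Hypothetical override of type 3; all other types keep their defaults.
print(resolve_types(["T  3 hbacc"])[3])
```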
<br />
The following are the Python format statements for each line:<br />
T %2d %8s\n<br />
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n<br />
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
M %77s\n<br />
M %77s\n<br />
M %77s\n<br />
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
B %3d %3d %3d %-2s\n<br />
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n<br />
R %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
C %6d %9d %9d\n<br />
S %6d %6d %3d %1d %1d %+11.3f\n<br />
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n <br />
D %6d %6d %6d %3d %3d %3d\n<br />
D %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
E\n<br />
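<br />
As a usage sketch for the format strings above, the Python below writes the X lines of one conf together with the C line that points at them. The function name <code>conf_records</code> is invented here, and the sketch assumes that <code>coordstart</code> and <code>coordend</code> are the first and last X <code>coordnum</code> of the conf, inclusive, which the field names suggest but this page does not spell out.<br />

```python
# Format strings copied from the X and C entries above.
X_FMT = "X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n"
C_FMT = "C %6d %9d %9d\n"


def conf_records(confnum, first_coordnum, atoms):
    """Format the X lines for one conf plus the C line that indexes them.

    atoms is a list of (atomnum, x, y, z) tuples.
    """
    x_lines = [X_FMT % (first_coordnum + i, atomnum, confnum, x, y, z)
               for i, (atomnum, x, y, z) in enumerate(atoms)]
    # Assumed inclusive range: last coordnum written for this conf
    last_coordnum = first_coordnum + len(atoms) - 1
    return x_lines, C_FMT % (confnum, first_coordnum, last_coordnum)
```

Each formatted X line is 52 characters plus the newline, which matches the width of the X column template.<br />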
<br />
The following are the Fortran77 format statements:<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
1000 format(2x,i2,1x,a8)<br />
!M zincname protname #atoms #bonds #xyz #groups #confs #sets #rigid #mlines #clusters<br />
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6,x,i6)<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)<br />
!M smiles or longname<br />
2200 format(2x,a77)<br />
!A stuff about each atom, 1 per line<br />
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,<br />
& f10.3,1x,f10.3,1x,f9.3)<br />
!B stuff about each bond, 1 per line<br />
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)<br />
!X coordnum atomnum confnum x y z<br />
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)<br />
!R rigidnum color x y z<br />
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)<br />
!C confnum #startcoord #endcoord<br />
7000 format(2x,i6,1x,i9,1x,i9)<br />
!S setnum #lines #confs_total broken hydrogens omega_energy<br />
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)<br />
!S setnum linenum #confs confs [until full column]<br />
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,<br />
& 1x,i6,1x,i6,1x,i6,1x,i6)<br />
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN<br />
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)<br />
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
!re-use 6000<br />
!E<br />
!E does not get a format line<br />
<br />
The following are Fortran95 format statements:<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
character (len=*), parameter :: DB2NAME = '(2x,i2,x,a8)' !1000<br />
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #maxmlines #clusters<br />
character (len=*), parameter :: DB2M1 =<br />
& '(2x,a16,x,a9,x,i3,x,i3,x,i6,x,i6,x,i6,x,i6,x,i6,x,i6)' !2000<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
character (len=*), parameter :: DB2M2 =<br />
& '(2x,f9.4,x,f10.3,x,f10.3,x,f10.3,x,f9.3)' !2100<br />
!M smiles/longname/arbitrary<br />
character (len=*), parameter :: DB2M3 = '(2x,a78)' !2200<br />
!A stuff about each atom, 1 per line<br />
character (len=*), parameter :: DB2ATOM =<br />
& '(2x,i3,x,a4,x,a5,x,i2,x,i2,x,f9.4,x,f10.3,x,<br />
& f10.3,x,f10.3,x,f9.3)' !3000<br />
!B stuff about each bond, 1 per line<br />
character (len=*), parameter :: DB2BOND =<br />
& '(2x,i3,x,i3,x,i3,x,a2)' !4000<br />
!X coordnumx atomnum confnum x y z<br />
character (len=*), parameter :: DB2COORD =<br />
& '(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)' !5000<br />
!R rigidnum color x y z<br />
character (len=*), parameter :: DB2RIGID =<br />
& '(2x,i6,x,i2,x,f9.4,x,f9.4,x,f9.4)' !6000<br />
!C confnum coordstart coordend<br />
character (len=*), parameter :: DB2CONF = '(2x,i6,x,i9,x,i9)' !7000<br />
!S setnum #lines #confs_total broken hydrogens omega_energy <br />
character (len=*), parameter :: DB2SET1 =<br />
& '(2x,i6,x,i6,x,i3,x,i1,x,i1,x,f11.3)' !8000<br />
!S setnum linenum #confs confs [until full column]<br />
character (len=*), parameter :: DB2SET2 =<br />
& '(2x,i6,x,i6,x,i1,x,i6,x,i6,x,i6,x,i6,<br />
& 1x,i6,x,i6,x,i6,x,i6)' !8100<br />
!D CLUSID STASET ENDSET ADD(itional matching spheres count) MST(art) MEN(d)<br />
character (len=*), parameter :: DB2CLUSTER =<br />
& '(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)' !9000<br />
!D NUM CO x y z<br />
!reuse DB2RIGID<br />
!E<br />
!E does not get a format line <br />
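<br />
For readers working in Python, the Fortran edit descriptors above translate directly into column slices. The sketch below handles only the descriptors that occur on this page (<code>x</code>, <code>i</code>, <code>a</code>, <code>f</code>) and treats a bare <code>x</code> as one skipped column, as the statements here do; <code>fortran_fields</code> and <code>read_record</code> are invented names, not part of DOCK.<br />

```python
import re


def fortran_fields(fmt):
    """Convert a format like '(2x,i6,x,i9,x,i9)' to (kind, start, end) triples."""
    pos = 0
    fields = []
    for token in fmt.strip("()").split(","):
        token = token.strip().lower()
        skip = re.fullmatch(r"(\d*)x", token)
        if skip:
            pos += int(skip.group(1) or 1)  # bare 'x' skips one column
            continue
        field = re.fullmatch(r"([iaf])(\d+)(?:\.\d+)?", token)
        if field is None:
            raise ValueError(f"unsupported edit descriptor {token!r}")
        width = int(field.group(2))
        fields.append((field.group(1), pos, pos + width))
        pos += width
    return fields


def read_record(fmt, line):
    """Read one record line using a Fortran-style format string."""
    convert = {"i": int, "f": float, "a": str.strip}
    return [convert[kind](line[start:end])
            for kind, start, end in fortran_fields(fmt)]


# The DB2CONF format applied to a C line built with the Python format string.
print(read_record("(2x,i6,x,i9,x,i9)", "C %6d %9d %9d" % (2, 5, 6)))
```

The leading <code>2x</code> in every format skips the record letter and the space after it, so the returned list contains only the data fields.<br />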
<br />
[[Category:Formats]]</div>