Mol2db2 format
The mol2db2 format describes files created by the mol2db2 program for input into the DOCK 3.7 molecular docking program.
mol2db2 format was designed by Ryan Coleman as part of his postdoctoral research in the Shoichet Lab.
It was first introduced with DOCK 3.7 and is not compatible with earlier versions of the DOCK series, such as DOCK 3.6.
File format description
Mol2db2 Format 2 File Format

T type information (implicitly assumed)
M molecule (4 lines req'd, after that they are optional, 24 lines max)
A atoms
B bond
X xyz
R rigid xyz for matching (can actually be any xyzs)
C conformation
S sets
D clusters
E end of molecule

T ## namexxxx (implicitly assumed to be the standard 7)
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters
M charge polar_solv apolar_solv total_solv surface_area
M smiles
M longname
[M arbitrary information preserved for writing out]
A stuff about each atom, 1 per line
B stuff about each bond, 1 per line
X coordnum atomnum confnum x y z
R rigidnum color x y z
C confnum coordstart coordend
S setnum #lines #confs_total broken hydrogens omega_energy
S setnum linenum #confs confs [until full column]
D clusternum setstart setend matchstart matchend #additionalmatching
D matchnum color x y z
E

Given the line descriptions above, the columns used by each line are shown below, followed by format statements for Python and Fortran. If speed or size becomes an issue, this might be replaced with a binary file format.

Notes:
17 children groups/group per line in current scheme.
9 children confs/group per line.
9 children confs/conf per line.
8 confs/set per line.
Groups/confs with no children are written out.
On the atom line, dt is dock type and co is color.

          1         2         3         4         5         6         7
01234567890123456789012345678901234567890123456789012345678901234567890123456789
T ## typename
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA
B NUM ATO ATO TY
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX
C CONFNO COORDSTAR COORDENDX
S SETIDX #LINES #CO C H +ENERGY.XXX
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
D CLUSID STASET ENDSET MST MEN ADD
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX
E

The following type lines are assumed by DOCK unless overridden:

T 1 positive
T 2 negative
T 3 acceptor
T 4 donor
T 5 ester_o
T 6 amide_o
T 7 neutral

The following are the Python format statements for each line:

T %2d %8s\n
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n
M %77s\n
M %77s\n
M %77s\n
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n
B %3d %3d %3d %-2s\n
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n
R %3d %2d %+9.4f %+9.4f %+9.4f\n
C %6d %9d %9d\n
S %6d %6d %3d %1d %1d %+11.3f\n
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n
D %6d %6d %6d %3d %3d %3d\n
D %3d %2d %+9.4f %+9.4f %+9.4f\n
E\n

The following are the Fortran 77 format statements:

!T ## namexxxx (implicitly assumed to be the standard 7)
1000 format(2x,i2,1x,a8)
!M zincname protname #atoms #bonds #xyz #groups #confs #sets #rigid #mlines #clusters
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6,x,i6)
!M charge polar_solv apolar_solv total_solv surface_area
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)
!M smiles or longname
2200 format(2x,a77)
!A stuff about each atom, 1 per line
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,
     & f10.3,1x,f10.3,1x,f9.3)
!B stuff about each bond, 1 per line
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)
!X atomnum confnum x y z
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)
!R rigidnum color x y z
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)
!C confnum #startcoord #endcoord
7000 format(2x,i6,1x,i9,1x,i9)
!S setnum #lines #confs_total broken hydrogens omega_energy
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)
!S setnum linenum #confs confs [until full column]
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,
     & 1x,i6,1x,i6,1x,i6,1x,i6)
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX
!re-use 6000
!E
!E does not get a format line

The following are the Fortran 95 format statements:

!T ## namexxxx (implicitly assumed to be the standard 7)
character (len=*), parameter :: DB2NAME = '(2x,i2,x,a8)' !1000
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #maxmlines #clusters
character (len=*), parameter :: DB2M1 = &
    '(2x,a16,x,a9,x,i3,x,i3,x,i6,x,i6,x,i6,x,i6,x,i6,x,i6)' !2000
!M charge polar_solv apolar_solv total_solv surface_area
character (len=*), parameter :: DB2M2 = &
    '(2x,f9.4,x,f10.3,x,f10.3,x,f10.3,x,f9.3)' !2100
!M smiles/longname/arbitrary
character (len=*), parameter :: DB2M3 = '(2x,a78)' !2200
!A stuff about each atom, 1 per line
character (len=*), parameter :: DB2ATOM = &
    '(2x,i3,x,a4,x,a5,x,i2,x,i2,x,f9.4,x,f10.3,x, &
    f10.3,x,f10.3,x,f9.3)' !3000
!B stuff about each bond, 1 per line
character (len=*), parameter :: DB2BOND = &
    '(2x,i3,x,i3,x,i3,x,a2)' !4000
!X coordnumx atomnum confnum x y z
character (len=*), parameter :: DB2COORD = &
    '(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)' !5000
!R rigidnum color x y z
character (len=*), parameter :: DB2RIGID = &
    '(2x,i6,x,i2,x,f9.4,x,f9.4,x,f9.4)' !6000
!C confnum coordstart coordend
character (len=*), parameter :: DB2CONF = '(2x,i6,x,i9,x,i9)' !7000
!S setnum #lines #confs_total broken hydrogens omega_energy
character (len=*), parameter :: DB2SET1 = &
    '(2x,i6,x,i6,x,i3,x,i1,x,i1,x,f11.3)' !8000
!S setnum linenum #confs confs [until full column]
character (len=*), parameter :: DB2SET2 = &
    '(2x,i6,x,i6,x,i1,x,i6,x,i6,x,i6,x,i6, &
    1x,i6,x,i6,x,i6,x,i6)' !8100
!D CLUSID STASET ENDSET ADD(itional matching spheres count) MST(art) MEN(d)
character (len=*), parameter :: DB2CLUSTER = &
    '(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)' !9000
!D NUM CO x y z
!reuse DB2RIGID
!E
!E does not get a format line
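As an illustration of how the Python format statements on this page map onto fixed columns, the sketch below writes a few line types with the `%` operator and reads the first M line back by column position. It is a minimal sketch and not code from mol2db2 or DOCK. The helper names (`format_line`, `parse_m1`), the field labels, and the example molecule values are all hypothetical. Every count field on the first M line is written as `%6d`.

```python
# Format strings taken from the Python format statements on this page;
# every count field on the first M line is written as %6d here.
DB2_FORMATS = {
    "T": "T %2d %8s\n",
    "M1": "M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n",
    "X": "X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n",
    "C": "C %6d %9d %9d\n",
}

# (label, width, converter) for the first M line, in column order.
# The labels are illustrative only, not identifiers used by DOCK.
M1_FIELDS = [
    ("zincname", 16, str),
    ("protname", 9, str),
    ("atoms", 3, int),
    ("bonds", 3, int),
    ("xyz", 6, int),
    ("confs", 6, int),
    ("sets", 6, int),
    ("rigid", 6, int),
    ("mlines", 6, int),
    ("clusters", 6, int),
]


def format_line(kind, *values):
    """Render one line using the fixed-width format for `kind`."""
    return DB2_FORMATS[kind] % values


def parse_m1(line):
    """Split a first M line into fields by column position."""
    fields = {}
    pos = 2  # skip the line-type letter and the space after it
    for label, width, convert in M1_FIELDS:
        fields[label] = convert(line[pos:pos + width].strip())
        pos += width + 1  # each field is followed by one separator column
    return fields


# Hypothetical molecule header: the name and all counts are made up.
header = format_line("M1", "ZINC00000001", "none", 12, 12, 120, 10, 5, 3, 5, 1)
print(repr(header))
print(parse_m1(header))
```

Note that `%` formatting pads short values but does not truncate long ones, so a value wider than its field would shift every later column. A writer built this way would need to check widths before formatting.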
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ This page is adapted from "DOCK3.7 Documentation" by Ryan G. Coleman. Based on a work at https://sites.google.com/site/dock37wiki/.