Mol2db2 Format 2

From DISI

Latest revision as of 15:44, 23 October 2014

This page is a wishlist of features that a new version of the flexibase file format should support. Features already implemented in the mol2db2 format are marked [x].

New Features

implemented

  • Real Atom Types and Bond Information [x]
  • Way to determine which mix-and-match conformations have clashes (and avoid trying them) [x]
  • A place to store an internal energy for each possible conformation [x]
  • Terminal hydrogen rotations?? [x]
  • support for clusters of conformations [x]
  • arbitrary information to be written into output mol2 file (5th and above M lines) [x]

wished

  • Per-conformation per-atom partial charge & solvation information to support internal energies
  • Aliphatic ring movements?
  • group tagging (needed for covalent docking) and basic set of covalent groups
  • specified rigid component override (and better rules for finding non-ring rigid components)
  • per molecule pKa
  • valence for each atom

Nomenclature Definitions

  • Conf - one set of atoms that moves together with a single position per atom.
  • Set - a group of conformations that completely defines one position for each atom in a ligand.
  • Cluster - Not yet implemented in DOCK3.7
  • Cloud - Not yet implemented in DOCK3.7
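
The two implemented terms can be pictured as small record types. The following is a minimal sketch, not mol2db2's own code: the class names `Conf` and `ConfSet` and the helper `set_is_complete` are hypothetical, atoms are assumed to be numbered from 1, and `coordstart`/`coordend` are assumed to be an inclusive range of coordinate numbers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Conf:
    """One rigid piece: atoms that move together, one position per atom."""
    confnum: int
    coordstart: int  # first coordinate number (assumed inclusive)
    coordend: int    # last coordinate number (assumed inclusive)


@dataclass
class ConfSet:
    """A group of confs that together place every atom of the ligand once."""
    setnum: int
    confs: List[int] = field(default_factory=list)


def set_is_complete(conf_set: ConfSet,
                    confs: Dict[int, Conf],
                    coord_atoms: Dict[int, int],
                    natoms: int) -> bool:
    """Return True if the set gives exactly one position for atoms 1..natoms.

    coord_atoms maps a coordinate number to the atom number it positions.
    """
    seen = []
    for confnum in conf_set.confs:
        conf = confs[confnum]
        for coordnum in range(conf.coordstart, conf.coordend + 1):
            seen.append(coord_atoms[coordnum])
    return sorted(seen) == list(range(1, natoms + 1))
```

A set that omits an atom, or that includes two confs placing the same atom, fails the check, which is the sense in which a set "completely defines one position for each atom".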

File Format

current plan for the file format

  • T type information (implicitly assumed)
  • M molecule (4 lines req'd, after that they are optional, 24 lines max)
  • A atoms
  • B bond
  • X xyz
  • R rigid xyz for matching (can actually be any xyzs)
  • C conformation
  • S sets
  • D clusters
  • E end of molecule
T ## namexxxx (implicitly assumed to be the standard 7)
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters
M charge polar_solv apolar_solv total_solv surface_area
M smiles
M longname
[M arbitrary information preserved for writing out]
A stuff about each atom, 1 per line 
B stuff about each bond, 1 per line
X coordnum atomnum confnum x y z 
R rigidnum color x y z
C confnum coordstart coordend
S setnum #lines #confs_total broken hydrogens omega_energy
S setnum linenum #confs confs [until full column]
D clusternum setstart setend matchstart matchend #additionalmatching
D matchnum color x y z
E 
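
Because every line starts with a one-letter record type and each molecule ends with an E line, a reader can split a file into molecules before interpreting any columns. A rough sketch, assuming the record letter always sits in the first column; the function names `split_molecules` and `count_records` are made up for illustration:

```python
from collections import Counter


def split_molecules(lines):
    """Yield one list of lines per molecule, ending at each E record."""
    current = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines between records
        current.append(line)
        if line[0] == "E":
            yield current
            current = []


def count_records(molecule_lines):
    """Count how many lines of each record type (T, M, A, ...) appear."""
    return Counter(line[0] for line in molecule_lines)
```

Comparing these counts against the totals declared on the first M line (#atoms, #bonds, #xyz and so on) is one cheap sanity check, though the S and D records span several lines per set or cluster, so their line counts will not equal the declared totals.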

With the above descriptions, here are the columns used by each line type. Python and Fortran format statements are given further down this page. If speed or size becomes an issue, this text format might be replaced with a binary file format.

notes: 17 children groups/group per line in current scheme. 9 children confs/group per line. 9 children confs/conf per line. 8 confs/set per line. groups/confs with no children are written out.
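
The limit of 8 confs per S line means a set with more confs is spread over several continuation lines. The sketch below shows one way to do the chunking with the S format strings listed on this page. It assumes line numbers start at 1 and that a short final line simply carries fewer conf fields; neither detail is stated here, and `format_set` is a hypothetical name.

```python
def format_set(setnum, confs, broken, hydrogens, omega_energy, per_line=8):
    """Return the S header line plus continuation lines for one set."""
    chunks = [confs[i:i + per_line] for i in range(0, len(confs), per_line)]
    # Header: setnum #lines #confs_total broken hydrogens omega_energy
    out = ["S %6d %6d %3d %1d %1d %+11.3f\n"
           % (setnum, len(chunks), len(confs), broken, hydrogens, omega_energy)]
    # Continuation: setnum linenum #confs confs...
    for linenum, chunk in enumerate(chunks, start=1):
        fields = " ".join("%6d" % confnum for confnum in chunk)
        out.append("S %6d %6d %1d %s\n" % (setnum, linenum, len(chunk), fields))
    return out
```

For a set of ten confs this gives one header line and two continuation lines, the first holding eight confs and the second holding two.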

on the atom line, dt is dock type and co is color.

          1         2         3         4         5         6         7
01234567890123456789012345678901234567890123456789012345678901234567890123456789
T ## typename
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA
B NUM ATO ATO TY
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX
C CONFNO COORDSTAR COORDENDX
S SETIDX #LINES #CO C H +ENERGY.XXX
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
D CLUSID STASET ENDSET MST MEN ADD
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX
E
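
Since the layout is fixed-width, a line can be read by slicing columns, much as the Fortran formats do. Here is a minimal sketch for the X line only, with slice positions read off the column ruler above; `X_FIELDS` and `parse_x_line` are illustrative names rather than part of mol2db2.

```python
# (field name, start column, end column (exclusive), converter)
X_FIELDS = (
    ("coordnum", 2, 11, int),
    ("atomnum", 12, 15, int),
    ("confnum", 16, 22, int),
    ("x", 23, 32, float),
    ("y", 33, 42, float),
    ("z", 43, 52, float),
)


def parse_x_line(line):
    """Parse one X (coordinate) record into a dict by column position."""
    if not line.startswith("X "):
        raise ValueError("not an X record: %r" % line)
    return {name: conv(line[start:end]) for name, start, end, conv in X_FIELDS}
```

Slicing rather than splitting on whitespace keeps the reader aligned with the declared columns. Note that a value too wide for its field would still shift everything after it, so the writer has to respect the widths.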

the type lines following are assumed by dock unless overridden:

T  1 positive
T  2 negative
T  3 acceptor
T  4 donor
T  5 ester_o
T  6 amide_o
T  7 neutral
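
A reader can model this by starting from the seven defaults and letting any T lines in the file replace entries. A small sketch, assuming the T line layout `T %2d %8s` given on this page; `DEFAULT_TYPES` and `read_types` are hypothetical names, and the override name `apolar` in the usage note is invented for the example.

```python
DEFAULT_TYPES = {
    1: "positive",
    2: "negative",
    3: "acceptor",
    4: "donor",
    5: "ester_o",
    6: "amide_o",
    7: "neutral",
}


def read_types(lines):
    """Return the type table: defaults, overridden by any T lines present."""
    types = dict(DEFAULT_TYPES)
    for line in lines:
        if line.startswith("T "):
            number = int(line[2:4])      # columns 2-3: type number
            name = line[5:13].strip()    # columns 5-12: type name
            types[number] = name
    return types
```

For instance, `read_types(["T  7 apolar"])` would rename type 7 and leave types 1 through 6 at their defaults.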

the following are the format statements for python for each line

T %2d %8s\n
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n
M %77s\n
M %77s\n
M %77s\n
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n
B %3d %3d %3d %-2s\n
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n
R %3d %2d %+9.4f %+9.4f %+9.4f\n
C %6d %9d %9d\n
S %6d %6d %3d %1d %1d %+11.3f\n
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n 
D %6d %6d %6d %3d %3d %3d\n
D %3d %2d %+9.4f %+9.4f %+9.4f\n
E\n

The following are the fortran77 format statements

!T ## namexxxx (implicitly assumed to be the standard 7)
1000 format(2x,i2,1x,a8)
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #mlines #clusters
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6)
!M charge polar_solv apolar_solv total_solv surface_area
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)
!M smiles or longname
2200 format(2x,a77)
!A stuff about each atom, 1 per line
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,
    &       f10.3,1x,f10.3,1x,f9.3)
!B stuff about each bond, 1 per line
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)
!X coordnum atomnum confnum x y z
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)
!R rigidnum color x y z
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)
!C confnum #startcoord #endcoord
7000 format(2x,i6,1x,i9,1x,i9)
!S setnum #lines #confs_total broken hydrogens omega_energy
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)
!S setnum linenum #confs confs [until full column]
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,
    &       1x,i6,1x,i6,1x,i6,1x,i6)
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX
!re-use 6000
!E
!E does not get a format line

The following are Fortran95 format statements:

!T ## namexxxx (implicitly assumed to be the standard 7)
      character (len=*), parameter :: DB2NAME = '(2x,i2,x,a8)' !1000
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #maxmlines #clusters
      character (len=*), parameter :: DB2M1 =
     &    '(2x,a16,x,a9,x,i3,x,i3,x,i6,x,i6,x,i6,x,i6,x,i6,x,i6)' !2000
!M charge polar_solv apolar_solv total_solv surface_area
      character (len=*), parameter :: DB2M2 =
     &    '(2x,f9.4,x,f10.3,x,f10.3,x,f10.3,x,f9.3)' !2100
!M smiles/longname/arbitrary
      character (len=*), parameter :: DB2M3 = '(2x,a78)' !2200
!A stuff about each atom, 1 per line
      character (len=*), parameter :: DB2ATOM =
     &    '(2x,i3,x,a4,x,a5,x,i2,x,i2,x,f9.4,x,f10.3,x,
     &    f10.3,x,f10.3,x,f9.3)' !3000
!B stuff about each bond, 1 per line
      character (len=*), parameter :: DB2BOND =
     &    '(2x,i3,x,i3,x,i3,x,a2)' !4000
!X coordnumx atomnum confnum x y z
      character (len=*), parameter :: DB2COORD =
     &    '(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)' !5000
!R rigidnum color x y z
      character (len=*), parameter :: DB2RIGID =
     &    '(2x,i6,x,i2,x,f9.4,x,f9.4,x,f9.4)' !6000
!C confnum coordstart coordend
      character (len=*), parameter :: DB2CONF = '(2x,i6,x,i9,x,i9)' !7000
!S setnum #lines #confs_total broken hydrogens omega_energy 
      character (len=*), parameter :: DB2SET1 =
     &    '(2x,i6,x,i6,x,i3,x,i1,x,i1,x,f11.3)' !8000
!S setnum linenum #confs confs [until full column]
      character (len=*), parameter :: DB2SET2 =
     &    '(2x,i6,x,i6,x,i1,x,i6,x,i6,x,i6,x,i6,
     &    1x,i6,x,i6,x,i6,x,i6)' !8100
!D CLUSID STASET ENDSET ADD(itional matching spheres count) MST(art) MEN(d)
      character (len=*), parameter :: DB2CLUSTER =
     &    '(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)' !9000
!D NUM CO x y z
!reuse DB2RIGID
!E
!E does not get a format line
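
The Fortran formats double as a column map: each `x` skips characters and each `i`, `f` or `a` descriptor claims a field of the given width. The sketch below turns such a format string into Python slice positions. It handles only the simple descriptors used on this page (no repeat counts or nested groups), and `fortran_slices` is a hypothetical helper, not part of DOCK.

```python
import re


def fortran_slices(fmt):
    """Convert e.g. '(2x,i3,x,a2)' into [(kind, start, end), ...] slices."""
    pos = 0
    slices = []
    for token in fmt.strip().strip("()").split(","):
        token = token.strip().lower()
        skip = re.fullmatch(r"(\d*)x", token)
        if skip:
            pos += int(skip.group(1) or 1)  # a bare 'x' skips one column
            continue
        fld = re.fullmatch(r"([iaf])(\d+)(?:\.\d+)?", token)
        if not fld:
            raise ValueError("unsupported descriptor: %r" % token)
        width = int(fld.group(2))
        slices.append((fld.group(1), pos, pos + width))
        pos += width
    return slices
```

Applied to the bond format `'(2x,i3,x,i3,x,i3,x,a2)'`, this gives fields at columns 2-4, 6-8, 10-12 and 14-15, which lines up with the `B NUM ATO ATO TY` row of the column layout.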