Mol2db2 Format 2

From DISI
[[Category:Wishlists]]

Revision as of 22:28, 16 April 2010

This page is a wishlist for features that would be nice for a new version of the flexibase file format to support.

  • Real Atom Types and Bond Information
  • Way to determine which mix-and-match conformations have clashes (and avoid trying them)
  • A place to store an internal energy for each possible conformation
  • Terminal hydrogen rotations??
  • Aliphatic ring movements?
  • Support for clusters of conformations
  • Group tagging (needed for covalent docking) and a basic set of covalent groups
  • Specified rigid component override (and better rules for finding non-ring rigid components)
  • Per-molecule pKa

The following is the current plan for the file format. Each line starts with a single letter that identifies its type:

  • T type information (implicitly assumed)
  • M molecule (only 2 lines ever)
  • A atoms
  • B bond
  • X xyz
  • G group
  • D group-conf mapping
  • C conformation
  • S sets
T ## namexxxx (implicitly assumed to be the standard 7)
M zincname protname #atoms #bonds #xyz #groups #confs #sets 
M charge polar_solv apolar_solv total_solv surface_area
A stuff about each atom, 1 per line 
B stuff about each bond, 1 per line
X atomnum confnum x y z 
G groupnum #lines #children_total
G groupnum linenum #children childgroup [until column is full]
D groupnum #lines #confs_total  
D groupnum linenum #confs confnum [until column is full]
C confnum #lines #children_total
C confnum linenum #children childconf [until column is full]
S setnum #lines #confs_total [INPUT|MIX] broken omega_energy
S setnum linenum #confs confs [until full column]
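Because every line starts with a one-letter type and the first M line carries the counts, a reader can bucket lines by type and cross-check them against the header. The following is a minimal sketch, not part of any existing mol2db2 code. The function names are hypothetical, and it assumes that fields are separated by whitespace, that the names contain no spaces, and that the M line holding the counts comes first.

```python
from collections import defaultdict


def bucket_records(lines):
    """Group the lines of one molecule by their leading type letter."""
    records = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines
            records[line[0]].append(line.rstrip("\n"))
    return records


def counts_match(records):
    """Compare the A, B and X line counts with the first M line.

    Assumption: the first M line is
    'M zincname protname #atoms #bonds #xyz #groups #confs #sets'.
    """
    header = records["M"][0].split()
    n_atoms, n_bonds, n_xyz = (int(value) for value in header[3:6])
    return (
        len(records["A"]) == n_atoms
        and len(records["B"]) == n_bonds
        and len(records["X"]) == n_xyz
    )
```

Only A, B and X are checked here because they are the line types described as one record per line. The G, D, C and S lines span several lines per entry and would need their own counting.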

Given the line descriptions above, the layout below shows the columns that each line type uses. Python format statements are listed further down; Fortran format statements may be added later. If speed or size becomes an issue, this text format might be replaced with a binary file format.

Notes on the current scheme:

  • 17 child groups per group (G) line
  • 9 child confs per group-conf (D) line
  • 9 child confs per conf (C) line
  • 8 confs per set (S) line
  • groups and confs with no children are written out
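The per-line limits in these notes amount to a chunking rule: a long child list is split across several continuation lines. The sketch below illustrates this for G lines, using the G format strings from this page. The helper names are hypothetical, and two details are assumptions rather than something the page states: continuation lines are numbered from 1, and #lines counts only the continuation lines, not the header line.

```python
G_CHILDREN_PER_LINE = 17  # from the notes: 17 child groups per G line


def chunk(items, per_line):
    """Split items into consecutive lists of at most per_line entries."""
    return [items[i:i + per_line] for i in range(0, len(items), per_line)]


def group_lines(groupnum, children):
    """Return the G lines for one group: a header, then continuation lines."""
    chunks = chunk(children, G_CHILDREN_PER_LINE)
    # Header line: groupnum, #lines, #children_total.
    lines = ["G %3d %3d %3d" % (groupnum, len(chunks), len(children))]
    # Assumption: continuation lines are numbered starting at 1.
    for linenum, part in enumerate(chunks, start=1):
        fields = " ".join("%3d" % child for child in part)
        lines.append("G %3d %3d %2d %s" % (groupnum, linenum, len(part), fields))
    return lines
```

With 20 children this gives a header plus two continuation lines, holding 17 and 3 children. A group with no children yields only the header line, which is one possible reading of "written out". The same pattern would apply to D, C and S lines with limits of 9, 9 and 8.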

On the atom line, DT is the dock type and CO is the color.

          1         2         3         4         5         6         7
01234567890123456789012345678901234567890123456789012345678901234567890123456789
T ## typename
M ZINCCODEX PROTCODEX ATO BON XYZXXX GRO CONFSX SETSXXXXX
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA
B NUM ATO ATO TY
X ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX
G GRO #LI #CH
G GRO LIN #C CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN
D GRO #LIN #CONFS
D GRO LINE # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS  
C CONFNO #LIN #CONFS
C CONFNO LINE # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
S SETIDXXXX #LINES #CONFS I C +ENER.GYX
S SETIDXXXX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
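Since the layout is column based, a line can be read by slicing fixed character positions rather than splitting on whitespace. As a rough sketch, the X line above can be read as follows. The slice boundaries are taken from the column ruler (0-based), and the function name is hypothetical.

```python
def parse_xyz_line(line):
    """Read one X line using the column positions from the layout above.

    Columns (0-based): 2-4 atom number, 6-11 conf number,
    13-21 x, 23-31 y, 33-41 z.
    """
    if not line.startswith("X "):
        raise ValueError("not an X line: %r" % line)
    atomnum = int(line[2:5])
    confnum = int(line[6:12])
    x = float(line[13:22])
    y = float(line[23:32])
    z = float(line[33:42])
    return atomnum, confnum, x, y, z


# Build an example line with the X format statement and read it back.
example = "X %3d %6d %+9.4f %+9.4f %+9.4f" % (1, 12, 1.5, -0.25, 10.0)
parsed = parse_xyz_line(example)
```

Slicing keeps working if two adjacent fields fill their columns completely, but it breaks if any value is wider than its column, so a reader that has to be robust would need to check for that case.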

The following type lines are assumed by DOCK unless overridden:

T  1 positive
T  2 negative
T  3 acceptor
T  4    donor
T  5  ester_o
T  6  amide_o
T  7  neutral
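One way a reader could handle these defaults is to start from the seven standard types and let any T line in the file replace the entry with the same number. This is only a sketch: the page does not say how an override is expressed, so treating a T line with a matching number as the override is an assumption, and the names below are hypothetical.

```python
# The seven standard types listed above.
DEFAULT_TYPES = {
    1: "positive",
    2: "negative",
    3: "acceptor",
    4: "donor",
    5: "ester_o",
    6: "amide_o",
    7: "neutral",
}


def read_types(lines):
    """Return the type table: the defaults, updated by any T lines found."""
    types = dict(DEFAULT_TYPES)  # copy so the defaults stay untouched
    for line in lines:
        if line.startswith("T "):
            fields = line.split()
            # Assumption: a T line with an existing number overrides it.
            types[int(fields[1])] = fields[2]
    return types
```

A file with no T lines then yields exactly the standard table, which matches the "implicitly assumed" wording in the line descriptions.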

The following are the Python format statements for each line type:

T %2d %8s\n
M %9s %9s %3d %3d %6d %3d %6d %9d\n
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n
B %3d %3d %3d %-2s\n
X %3d %6d %+9.4f %+9.4f %+9.4f\n
G %3d %3d %3d\n
G %3d %3d %2d %3d %3d %3d %3d %3d %3d %3d %3d %3d %3d %3d %3d %3d %3d %3d %3d %3d\n
D %3d %4d %6d\n
D %3d %4d %1d %6d %6d %6d %6d %6d %6d %6d %6d %6d\n