Mol2db2 Format 2: Difference between revisions

From DISI
m (sets changed)
m (done for now)
Line 17: Line 17:
*X xyz   
*X xyz   
*G group
*G group
*GC group-conf mapping
*D group-conf mapping
*C conformation
*C conformation
*S sets
*S sets
Line 28: Line 28:
  G groupnum #lines #children_total
  G groupnum #lines #children_total
  G groupnum linenum #children childgroup [until column is full]
  G groupnum linenum #children childgroup [until column is full]
  GC groupnum #lines #confs_total   
  D groupnum #lines #confs_total   
  GC groupnum linenum #confs confnum [until column is full]
  D groupnum linenum #confs confnum [until column is full]
  C confnum #lines #children_total
  C confnum #lines #children_total
  C confnum linenum #children childconf [until column is full]
  C confnum linenum #children childconf [until column is full]
  S setnum #lines #confs total data...
  S setnum #lines #confs_total [INPUT|MIX] broken omega_energy
  S setnum linenum #confs_total confs [until full column]
  S setnum linenum #confs confs [until full column]


With the above descriptions, here is a description of the columns that are used. Format statements for python/fortran will also appear at some point. If speed/size becomes an issue this might get replaced with a binary file format.
With the above descriptions, here is a description of the columns that are used. Format statements for python/fortran will also appear at some point. If speed/size becomes an issue this might get replaced with a binary file format.
Line 42: Line 42:
8 confs/set per line.
8 confs/set per line.


          1        2        3        4        5        6        7
  01234567890123456789012345678901234567890123456789012345678901234567890123456789
  01234567890123456789012345678901234567890123456789012345678901234567890123456789
  M ZINCCODEX PROTCODEX ATO BON XYZXXX GRO CONFSX SETSXXXXX
  M ZINCCODEX PROTCODEX ATO BON XYZXXX GRO CONFSX SETSXXXXX
Line 50: Line 51:
  G GRO #LI #CH
  G GRO #LI #CH
  G GRO LIN #C CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN
  G GRO LIN #C CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN
  GC GRO #LIN #CONFS
  D GRO #LIN #CONFS
  GC GRO LINE # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS   
  D GRO LINE # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS   
  C CONFNO #LIN #CONFS
  C CONFNO #LIN #CONFS
  C CONFNO LINE # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
  C CONFNO LINE # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
  S SETIDXXXX #LINES #CONFS DATA
  S SETIDXXXX #LINES #CONFS I C +ENER.GYX
  S SETIDXXXX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
  S SETIDXXXX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS


[[Category:Wishlists]]
[[Category:Wishlists]]

Revision as of 18:08, 15 April 2010

This page is a wishlist for features that would be nice for a new version of the flexibase file format to support.

  • Real atom types and bond information
  • A way to determine which mix-and-match conformations have clashes (and avoid trying them)
  • A place to store an internal energy for each possible conformation
  • Terminal hydrogen rotations?
  • Aliphatic ring movements?
  • Support for clusters of conformations
  • Group tagging (needed for covalent docking) and a basic set of covalent groups
  • A specified rigid-component override (and better rules for finding non-ring rigid components)
  • Per-molecule pKa

The following is the current plan for the file format. Each line starts with a one-letter record type:

  • M molecule (only 2 lines ever)
  • A atoms
  • B bonds
  • X xyz
  • G group
  • D group-conf mapping
  • C conformation
  • S sets
M zincname protname #atoms #bonds #xyz #groups #confs #sets 
M charge polar_solv apolar_solv surface_area total_solv
A stuff about each atom, 1 per line 
B stuff about each bond, 1 per line
X atomnum confnum x y z 
G groupnum #lines #children_total
G groupnum linenum #children childgroup [until column is full]
D groupnum #lines #confs_total  
D groupnum linenum #confs confnum [until column is full]
C confnum #lines #children_total
C confnum linenum #children childconf [until column is full]
S setnum #lines #confs_total [INPUT|MIX] broken omega_energy
S setnum linenum #confs confs [until full column]
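
As a rough illustration of how the multi-line G, D, C and S records above could be reassembled, here is a minimal Python sketch. It is not the project's reader. It assumes that fields are separated by whitespace, that `#lines` counts only the continuation lines that follow the header line, and that `#children_total` equals the sum of the per-line counts; none of these is confirmed by the plan above. The function name `read_children` and the sample lines are hypothetical.

```python
def read_children(lines):
    """Collect child ids from one header line plus its continuation lines.

    Assumed layout (whitespace separated):
      header:       TAG id #lines #total
      continuation: TAG id linenum #count child child ...
    """
    header = lines[0].split()
    ident = int(header[1])
    n_lines = int(header[2])   # assumed: number of continuation lines
    n_total = int(header[3])   # assumed: total number of children listed

    children = []
    for line in lines[1:1 + n_lines]:
        parts = line.split()
        count = int(parts[3])
        children.extend(int(p) for p in parts[4:4 + count])

    if len(children) != n_total:
        raise ValueError("child count does not match the header total")
    return ident, children


# Hypothetical G record: group 1 with three child groups spread over two lines.
sample = [
    "G 1 2 3",
    "G 1 1 2 2 3",
    "G 1 2 1 4",
]
print(read_children(sample))
```

The same function would apply to D, C and the continuation part of S records if they follow the same header/continuation pattern, although the S header carries extra fields after `#confs_total` that this sketch ignores.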

Given the record descriptions above, the fixed-column layout of each record is shown below. Format statements for Python/Fortran will also be added at some point. If speed or size becomes an issue, this text format might be replaced with a binary file format.

Notes: in the current scheme, one 80-column line holds up to 17 child groups per group (G), 9 child confs per group (D), 9 child confs per conf (C), and 8 confs per set (S).
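
The per-line capacities in the notes determine how many lines a record needs. The sketch below does that arithmetic in Python; the dictionary name `ITEMS_PER_LINE` and the function `continuation_lines` are hypothetical, and whether the `#lines` field is meant to hold exactly this number is an assumption.

```python
import math

# Items that fit on one line, taken from the notes above.
ITEMS_PER_LINE = {"G": 17, "D": 9, "C": 9, "S": 8}


def continuation_lines(record_type, n_items):
    """Lines needed to list n_items ids for the given record type."""
    return math.ceil(n_items / ITEMS_PER_LINE[record_type])


# 18 child groups no longer fit on a single G line.
print(continuation_lines("G", 18))
```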

          1         2         3         4         5         6         7
01234567890123456789012345678901234567890123456789012345678901234567890123456789
M ZINCCODEX PROTCODEX ATO BON XYZXXX GRO CONFSX SETSXXXXX
M +CHA.RGEX +POLAR.SOL +APOLA.SOL SURFA.REA +TOTAL.SOL
A NUM NAME TYPEX +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA
B NUM ATO ATO TYPE
X ATO CONFNU +XCO.OORD +YCO.ORD +ZCO.ORD
G GRO #LI #CH
G GRO LIN #C CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN
D GRO #LIN #CONFS
D GRO LINE # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS  
C CONFNO #LIN #CONFS
C CONFNO LINE # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
S SETIDXXXX #LINES #CONFS I C +ENER.GYX
S SETIDXXXX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS
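
Until real format statements exist, one possible way to read these fixed-column records in Python is to derive the column spans directly from a template line. This is only a sketch: it assumes each template token marks the exact columns of one field and that values never overflow their field. The helper names `field_spans` and `split_fixed` and the sample values are hypothetical.

```python
import re

# Template for the first M line, copied from the layout above.
M_TEMPLATE = "M ZINCCODEX PROTCODEX ATO BON XYZXXX GRO CONFSX SETSXXXXX"


def field_spans(template):
    """Return (start, end) column spans, one per token in the template."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", template)]


def split_fixed(line, template):
    """Slice a data line at the template's column spans and strip padding."""
    return [line[a:b].strip() for a, b in field_spans(template)]


# Hypothetical data line, right-justified to the template's field widths.
example = "M {:>9} {:>9} {:>3} {:>3} {:>6} {:>3} {:>6} {:>9}".format(
    "ZINC00001", "PROT00001", 12, 12, 120, 3, 10, 4
)
print(split_fixed(example, M_TEMPLATE))
```

Slicing by column rather than splitting on whitespace keeps the parse stable when a field is blank, which is the usual reason to prefer it for fixed-width text.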