Mol2db2 Format 2: Difference between revisions

Revision as of 17:51, 15 April 2010

This page is a wishlist of features that a new version of the flexibase file format should support.

Real Atom Types and Bond Information
A way to determine which mix-and-match conformations clash (and to avoid trying them)
A place to store an internal energy for each possible conformation
Terminal hydrogen rotations??
Aliphatic ring movements?
Support for clusters of conformations
Group tagging (needed for covalent docking) and a basic set of covalent groups
Specified rigid component override (and better rules for finding non-ring rigid components)
Per-molecule pKa

The following is the current plan for the file format. Each line starts with a record tag:

M molecule (always exactly 2 lines)
A atoms
B bonds
X xyz
G group
GC group-conf mapping
C conformation
S sets

M zincname protname #atoms #bonds #xyz #groups #confs #sets 
M charge polar_solv apolar_solv surface_area total_solv
A stuff about each atom, 1 per line 
B stuff about each bond, 1 per line
X atomnum confnum x y z 
G groupnum #lines #children_total
G groupnum linenum #children childgroup [until column is full]
GC groupnum #lines #confs_total  
GC groupnum linenum #confs confnum [until column is full]
C confnum #lines #children_total
C confnum linenum #children childconf [until column is full]
S #confnums confnum [more] data_about_this_conf
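
The record plan above can be read with a small dispatcher keyed on the leading tag. The following is only a minimal sketch: it assumes fields can be split on whitespace (the final format is column-based, so this may not hold for every record), and the function names and dictionary keys are hypothetical, not part of the format.

```python
from collections import defaultdict

# Record tags from the plan above; "GC" is the only two-letter tag.
RECORD_TAGS = {"M", "A", "B", "X", "G", "GC", "C", "S"}

# Count fields of the first M line, in the order given in the plan.
# These key names are illustrative only (an assumption of this sketch).
M_COUNT_FIELDS = ("atoms", "bonds", "xyz", "groups", "confs", "sets")


def split_records(lines):
    """Group whitespace-split fields by their leading record tag."""
    records = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        tag = fields[0]
        if tag not in RECORD_TAGS:
            raise ValueError(f"unknown record tag: {tag!r}")
        records[tag].append(fields[1:])
    return records


def parse_molecule_header(records):
    """Read the names and counts from the first of the two M lines."""
    m_lines = records["M"]
    if len(m_lines) != 2:
        raise ValueError("expected exactly 2 M lines")
    zincname, protname, *counts = m_lines[0]
    header = {"zincname": zincname, "protname": protname}
    header.update(zip(M_COUNT_FIELDS, map(int, counts)))
    return header
```

The counts in the header could then be used to check that the expected number of A, B, X and other records was actually read.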

Given the record descriptions above, the column layout for each record type follows. Format statements for Python and Fortran will be added at some point. If speed or size becomes an issue, this text format might be replaced with a binary file format.

Note: the current scheme fits 17 child groups on each line.
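
As an illustration of the 17-per-line limit, here is a rough sketch of how the children of one group might be split across G records. It assumes that line numbers start at 1 and that #lines counts only the child lines, not the header line; neither is stated on this page. The function name is hypothetical.

```python
CHILDREN_PER_LINE = 17  # limit from the note above


def group_lines(groupnum, children):
    """Build the G records for one group as lists of fields."""
    chunks = [
        children[i:i + CHILDREN_PER_LINE]
        for i in range(0, len(children), CHILDREN_PER_LINE)
    ]
    # Header record: G groupnum #lines #children_total
    records = [["G", groupnum, len(chunks), len(children)]]
    # Child records: G groupnum linenum #children childgroup ...
    # Assumption: linenum starts at 1.
    for linenum, chunk in enumerate(chunks, start=1):
        records.append(["G", groupnum, linenum, len(chunk), *chunk])
    return records
```

Under these assumptions, a group with 40 children produces one header record and three child records holding 17, 17 and 6 children. The same chunking would apply to GC and C records if they keep the same layout.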

01234567890123456789012345678901234567890123456789012345678901234567890123456789
M ZINCCODEX PROTCODEX ATO BON XYZXXX GRO CONFSX SETSXXXXX
M +CHA.RGEX +POLAR.SOL +APOLA.SOL SURFA.REA +TOTAL.SOL
A NUM NAME TYPEX +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA
B NUM ATO ATO TYPE
X ATO CONFNU +XCO.OORD +YCO.OORD +ZCO.OORD
G GRO #LIN #CH
G GRO LIN #C CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN CGN
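
Until the official format statements appear, the X record template can be approximated with Python format strings. This is a sketch under several assumptions: fields are right-justified, coordinates always carry a sign, and all three coordinate fields are 9 characters wide (sign, three digits, decimal point, four digits). The function names are hypothetical.

```python
def format_x_line(atomnum, confnum, x, y, z):
    """Write one X record: 3-wide atom number, 6-wide conformation
    number, then three signed 9-wide coordinates (assumed widths)."""
    return f"X {atomnum:3d} {confnum:6d} {x:+9.4f} {y:+9.4f} {z:+9.4f}"


def parse_x_line(line):
    """Read one X record back by fixed column positions."""
    if not line.startswith("X "):
        raise ValueError("not an X record")
    atomnum = int(line[2:5])     # columns 2-4
    confnum = int(line[6:12])    # columns 6-11
    x = float(line[13:22])       # columns 13-21
    y = float(line[23:32])       # columns 23-31
    z = float(line[33:42])       # columns 33-41
    return atomnum, confnum, x, y, z
```

Reading by column position rather than splitting on whitespace keeps the parser working even if adjacent fields ever fill their full width.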
